Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips...

36
Thermal Modeling and Management Thermal Modeling and Management for 3D MPSoCs with Active Cooling P fD id Ati Al Prof. David Atienza Alonso Embedded Systems Laboratory (ESL) Institute of EE, Faculty of Engineering © ESL/EPFL 2010 ARTIST Summer School 2010, Autrans (France)

Transcript of Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips...

Page 1: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Thermal Modeling and ManagementThermal Modeling and Management for 3D MPSoCs with Active Cooling

P f D id Ati AlProf. David Atienza AlonsoEmbedded Systems Laboratory (ESL) y y ( )Institute of EE, Faculty of Engineering

© ESL/EPFL 2010

ARTIST Summer School 2010, Autrans (France)

Page 2: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Advantages of 3D vs. 2D Chips

Promises • Reduce average length of on-chip global wires• Reduce average length of on-chip global wires• Increase number of devices

reachable in given time budget

• Greatly facilitate heterogeneous integration (e.g. logic-DRAM stacks)(e.g. logic DRAM stacks)

© ESL/EPFL 20102

Samsung Wafer Stack Package (WSP) memory

[Figures: Ray Yarema, Fermilab]

Page 3: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Thermal-Reliability Issues in 3D Chips

Latest chips increase power densityp p y Non-uniform hot-spots in 2D chips [Sun,

1.8 GHz Sparc v9 In 3D chips heat affects several p

Microproc]In 3D chips, heat affects several

layers! (even more “cool” components)

[Sun, Courtesy: [ ,Niagara

BroadbandProcessor]

[IBM and

Irvine Sens.]

© ESL/EPFL 2010 3

Page 4: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Thermal-Reliability Issues in 3D Chips

Latest chips increase power densityp p y Non-uniform hot-spots in 2D chips [Sun,

1.8 GHz Sparc v9 In 3D chips heat affects several p

Microproc]In 3D chips, heat affects several

layers! (even more “cool” components)

[Sun, Courtesy:Higher chances

of thermal [ ,Niagara

BroadbandProcessor]

[IBM and

Irvine Sens.]

wear-outs and

hvery short lifetimes!

© ESL/EPFL 2010 4

Page 5: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Run-Time Heat Spreading in 3D Chips

8000

9000

10000 Layer 2

420 5-tier 3D stack: 10 heat sources and sensors

2000

3000

4000

5000

6000

7000

8000

wid

th (u

m)

400

405

410

415

Inject between 4W – 1.5W

0 2000 4000 6000 8000 100000

1000

length (um)

7000

8000

9000

10000 Layer 3

404

406 2nd Tier

1000

2000

3000

4000

5000

6000

7000

wid

th (u

m)

396

398

400

402

0 2000 4000 6000 8000 100000

length (um)

394

5000

6000

7000

8000

9000

10000 Layer 4

dth

(um

)

397

398

399

400

401

3rd Tier5th Tier

0 2000 4000 6000 8000 100000

1000

2000

3000

4000

length (um)

wid

393

394

395

396

5000

6000

7000

8000

9000

10000 Layer 5

h (u

m)

394

395

396

397

Large and non-uniform

© ESL/EPFL 20100 2000 4000 6000 8000 10000

0

1000

2000

3000

4000

length (um)

wid

th

390

391

392

393 4th Tier heat propagation! (up to 130º C on top tier) 5

Page 6: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

NanoTera CMOSAIC Project: Design of 3D MPSoCs with Advanced Cooling3D MPSoCs with Advanced Cooling

3D systems require novel electro-thermal co-design• Academic partners: EPFL and ETHZ• Academic partners: EPFL and ETHZ• Industrial: IBM Zürich

© ESL/EPFL 20106

Page 7: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

NanoTera CMOSAIC Project: Design of 3D MPSoCs with Advanced Cooling3D MPSoCs with Advanced Cooling

3D systems require novel electro-thermal co-design• Academic partners: EPFL and ETHZ

3D stacked MPSoC chips: microchannels etched on

• Academic partners: EPFL and ETHZ• Industrial: IBM Zürich

pback side to circulate liquid coolant

S tSystem Level Active

C liadjustment Cooling Manager

adjustmentof coolant flux

(3D heat flow

task scheduling andexecution control

© ESL/EPFL 2010

prediction)7

Page 8: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Outline

IntroductionIntroduction 3D chip thermal modeling framework Validation of 3D thermal model Validation of 3D thermal model Liquid cooling modeling

Li id li d l lid ti Liquid cooling model validation Close-loop 3D MPSoCs thermal management with

ti liactive cooling Experiments and conclusions

© ESL/EPFL 2010 8

Page 9: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Compact RC-Based Tier Thermal Model

Gate-level thermal model qbTbq 6

qb_top qb_backRC Network of

qbTbq jf

ffjbj

01

q qb_right

qb_left

Si/metal layer cells qb_bottomqb_front

Convective boundary conditions

cells

qb_top = htopA(Ta-Ttop)

2D tier modeled as heat flux moving between adjacent cells

Convective boundary conditions between layers in tier

I•

(qbi)I

I+1•

(qbi)I+1

f i

I-1•

© ESL/EPFL 20109

qb_bottom = hbottomA(Ta-Tbottom)face i

[Atienza et al., TODAES 2007]

Page 10: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Complete 3D Chip Thermal Modeling

Multi-level execution for thermal convergence in 3D • Local (2D-tier), liquid channels and global (3D) propagation

Evaluate local temperature for each cell

erge

nce

ns

vel c

onve

N it

erat

ion

Feedbacktemperature Update with neighbour

Tier

-levN temperature Upda e e g bou

temperature difusion

Go to next tier or microchannel

© ESL/EPFL 201010[Ayala et al., NanoNet 2009]

Page 11: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

3D Chip Thermal Library Validation

Extensible set of layers in 3D stackExtensible set of layers in 3D stack• up to 9 tiers and heat spreader• Pre-defined layers:

Sili (10 l ) l Silicon, copper (10 layers), glue, overmold, interposer, bump

Configurable nr. of cells and iterations per tier• Initially 10ms thermal interval (1000 iterat./tier)

Multi-tier test chip manufactured at EPFL:

© ESL/EPFL 2010 11

Page 12: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

3D Chip Thermal Library Validation

Extensible set of layers in 3D stackExtensible set of layers in 3D stack• up to 9 tiers and heat spreader• Pre-defined layers:

Sili (10 l ) l Silicon, copper (10 layers), glue, overmold, interposer, bump

Configurable nr. of cells and iterations per tier• Initially 10ms thermal interval (1000 iterat./tier)

Multi-tier test chip manufactured at EPFL:• Three types of tiers• Three types of tiers

© ESL/EPFL 2010 12

Page 13: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

3D Thermal Library Validation: Creating Various 3D Thermal MapsCreating Various 3D Thermal Maps

Flexibility for thermal characterizationy

© ESL/EPFL 2010 13

Page 14: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

3D Thermal Library Validation: Creating Various 3D Thermal Maps

Flexibility for thermal characterization

Creating Various 3D Thermal Maps

y

© ESL/EPFL 2010 14

Page 15: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

3D Thermal Library Validation: Creating Various 3D Thermal MapsCreating Various 3D Thermal Maps

Flexibility for thermal characterizationy

10 heat sources and sensors per layer, ibl t b i lt l ti t d

© ESL/EPFL 2010

accesible to be simultaneously activated

15

Page 16: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

3D Thermal Library Validation: Correlation with 5-Tier 3D StackCorrelation with 5 Tier 3D Stack

3D Chi EPFL L 3 h t i ti

29303132

e (m

V)

3D Chip, EPFL, Layer 3 characterizationBlue Curve: 3D current -heat model for D8Pink curve: Heater current measured in D8

Dev8D7HD8S

242526272829

0 200 400 600 800 1000 1200

Sens

or V

olta

g

0 200 400 600 800 1000 1200Heater Current (mA), applied to Dev 7

3D Chip EPFL multi tier characterization

343638

mV)

3D Chip, EPFL, multi-tier characterizationBue/Pink Curve: D7 (tier 1) and D8 (tier 4)Red Curve: 3D current-heat model for D8

Dev6D 7

2628303234

Sens

or V

olta

ge ( Dev7

Div6_Iheat7

© ESL/EPFL 2010

240 200 400 600 800 1000 1200 1400

Heater Current (mA), applied to Dev 7

16[Ayala et al., Nano-Nets ’09]

Page 17: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

3D Thermal Library Validation: Correlation with 5-Tier 3D StackCorrelation with 5 Tier 3D Stack

3D Chi EPFL L 3 h t i ti

29303132

e (m

V)

3D Chip, EPFL, Layer 3 characterizationBlue Curve: 3D current -heat model for D8Pink curve: Heater current measured in D8

Dev8D7HD8S

242526272829

0 200 400 600 800 1000 1200

Sens

or V

olta

g

0 200 400 600 800 1000 1200Heater Current (mA), applied to Dev 7

3D Chip EPFL multi tier characterization

Variations of less than 1.5% between 3D stack measurements and new 3D thermal model

343638

mV)

3D Chip, EPFL, multi-tier characterizationBue/Pink Curve: D7 (tier 1) and D8 (tier 4)Red Curve: 3D current-heat model for D8

Dev6D 7

2628303234

Sens

or V

olta

ge ( Dev7

Div6_Iheat7

© ESL/EPFL 2010

240 200 400 600 800 1000 1200 1400

Heater Current (mA), applied to Dev 7

17[Ayala et al., Nano-Nets ’09]

Page 18: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Modeling Through Silicon Vias (TSVs) in 3D Stacksin 3D Stacks

TSVFigure: LSM-EPFL

TSVs:• Size: 5-10um x 10-100um• TSVs change resistivity of interlayer material (IM)

Modeling Granularities:1. Homogeneous

distribution, one R ,value for the IM

2. Different R value per unit (core cache etc )unit (core, cache, etc.)

3. Exact locations of TSVs

Hi h Source: IBM Zürich and

© ESL/EPFL 2010

• Higher accuracy• Higher complexity

18

Source: IBM Zürich and Y.Heights

Page 19: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

TSV Modeling Accuracy in 3D Stacks

Chosen to model TSV groups in localized positions

© ESL/EPFL 2010 19

Chosen to model TSV groups in localized positions of 3D MPSoCs

Page 20: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Liquid Flux Model for Laminar Flowq

Local junction temperature modeled as RC t knetwork:

Rtot = Rcond + Rconv + Rheat Thermal resist.

Heat source

Rtot = 1/(Gsi/t + 1/Rb) + A/(bAt) + A/(VPcp) of Si

Chip back-side temperatureSi b Heating Fl t d

Thermal

temperatureSi base thickness

Heating area Total area

Flow rate and density

Dependence of thermal resistance in liquid Thermal resistance of

wiringq2

Dependence of thermal resistance in liquid flux modeled as a quadratic form• Variable value of coolant flux (Φ)

T1

P P

Variable value of coolant flux (Φ)∆Rheat ≈ aΦ + bΦ2 ; b << a

© ESL/EPFL 2010q1

P1

P2

[Atienza et al., THERMINIC ’09 and DATE ‘10] 20

Page 21: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

3D Thermal Model with Liquid Cooling

New set of layers in 3D stack3D t k ( t 9 ti )• 3D stack (up to 9 tiers)

• 1 microchannel and coolant flow per tier

5 ti t k ith i h l d if ld 5-tier stack with microchannels and manifold cooling seal manufactured at IBM/EPFL• Enables different multi-tier liquid flux injection

Micro-HeaterLiquid

j

PCB Micro-Channels

© ESL/EPFL 2010 21

Source: IBM & ESL, EPFL

Page 22: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Manufacturing of 5-Tier 3D Test Chip with Liquid Channels in Multiple TiersLiquid Channels in Multiple Tiers

Front-sideBack-side

Figure: IBM & ESL, EPFL

Adding multi-tier liquid cooling in-/out-lets Multi-tier active cooling technology feasible for

© ESL/EPFL 2010 22

technology feasible for 3D-stacked chips

Page 23: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Correlation Results: Liquid Cooling and 3D Heat TransferLiquid Cooling and 3D Heat Transfer

Temperature evolution at the junction (Tj)

q 1q 2 P2

• Tested range: 0.015 to 0.15 L/min • Similar accuracy results at different channels

qq

T 1 P1

Avg Max temp Error= 0.6%

© ESL/EPFL 2010 23[Atienza et al., THERMINIC ’09]

Page 24: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Correlation Results: Liquid Cooling and 3D Heat Transfer

q 1q 2 P2

Liquid Cooling and 3D Heat Transfer Temperature evolution at the junction (Tj)

qq• Tested range: 0.015 to 0.15 L/min • Similar accuracy results at different channels

T 1 P1Variations of less than 1% between measurements

and RC-based 3D thermal model with liquid cooling

Avg Max temp Error= 0.6%

© ESL/EPFL 2010 24[Atienza et al., THERMINIC ’09]

Page 25: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Complete 3D Chip Thermal Modeling Flow with Liquid Coolingwith Liquid Cooling

Inputs:Inputs:

• Workload information• Floorplan, TSV areas, package

( )

Inputs: • Workload information

• Activity of cores

Scheduler (Reactive Proactive)

• Temperature (for dynamic policies) Power Manager (DPM)

Inputs: • Power trace for each unitScheduler (Reactive, Proactive) Power trace for each unit

• Floorplan, package and die properties (Niagara-1), TSV area percentage/distribution

• Flow rate

3D Thermal Simulator w. Liquid Cooling based on EPFL-IBM 3D chips

(Integrated within internal HotSpot tool version)

Transient Temperature Response

© ESL/EPFL 2010

Transient Temperature Response for Each Unit

25

Page 26: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Run-Time HW/SW Thermal Modeling Framework for 3D Chips

Multi Proc OS + DVFS + Task Migration

Framework for 3D Chips Exploitation of both hardware and software benefits

I/O

SRAM SRAM

CPU

SRAM

CPUMulti-Proc. OS + DVFS + Task Migration

Sw app 1 ... Sw app NZeroZero--delaydelayMPS CMPS C

sniffer

sniffersniffer

iff

sniffer

SRAM SRAM

I/O

SRAM

CPUCPUMPSoCMPSoC

architecturearchitecturesimulationsimulation

Energy of 2D componentssniffer

sniffer

sniffersniffer

sniffersnifferMPSoC Behavior

Emulation on FPGA

simulationsimulation p

Temp. (T) of 2D components

standardstandard Ethernet Ethernet connectionconnection & & dedicateddedicated HW monitorHW monitorDetailedDetailed components

Software Thermal

si si si sicu cucucucu

thermalthermalanalysisanalysis of of 2D2D MPSoCMPSoC

© ESL/EPFL 2010

Model Host PC

si sisi

sisi

sisi

sisi2D 2D MPSoCMPSoC

layoutlayout26[D. Atienza et al., TODAES 2007]

Page 27: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Run-Time HW/SW Thermal Modeling Framework for 3D ChipsFramework for 3D Chips

Multi Proc OS + DVFS + Task Migration Exploitation of both hardware and software benefits

I/O

SRAM SRAM

CPU

SRAM

CPUMulti-Proc. OS + DVFS + Task Migration

Sw app 1 ... Sw app NZeroZero--delaydelayMPS CMPS C

sniffer

sniffersniffer

iff

sniffer

SRAM SRAM

I/O

SRAM

CPUCPUEnergy of 3D components

MPSoCMPSoCarchitecturearchitecturesimulationsimulation sniffer

sniffer

sniffersniffer

sniffersnifferMPSoC Behavior

Emulation on FPGA

componentssimulationsimulation

standardstandard Ethernet Ethernet connectionconnection & & dedicateddedicated HW monitorHW monitor

Temp. of3D components

3D StackThermal

Nth Tier

© ESL/EPFL 2010

Model1st Tier Host PC

[D. Atienza, THERMINIC 2009] 27

Page 28: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Thermal Management for 3D-MPSoCs with Liquid Coolingwith Liquid Cooling

Active-Adapt3D: Combined policy manager (B t P A d t IEEE/IFIP VLSI S C 2009)(Best-Paper Award at IEEE/IFIP VLSI-SoC 2009)• Predictive, floorplan-based task assignment and DVFS

Cl l i bl li id li t l• Close-loop variable liquid cooling control T ≥ 80°C Increment flow rate ; T < 80°C Decrement P li b li d ti l ti l Policy can be applied reactively or proactively

Thermal SensorsSystem Temperature 

Flow Rate

Temperature Measurements

pDynamics

REACTIVE

ARMA‐Based 

Flow Rate Tuner

Temperature 

© ESL/EPFL 2010 28

Predictor Forecast

PROACTIVE

Page 29: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Adaptive Thermal-Aware Task Assignment Policy for 3D MPSoCsAssignment Policy for 3D MPSoCs

Cores on layers closer to the heat sink can be cooled faster in comparison to cores further away

Adapt-3D assigns a thermal index ( ) to each core in order to distinguish the location of the cores• Higher Core more prone to hot spotsi

Higher Core more prone to hot spotsi

For cores at locations 1, 2 and 3:

Chip 2 2

3

321

Chip‐1

Chip‐2 2

© ESL/EPFL 2010

Chip‐1

29[Coskun and Atienza, DATE ‘09]

Page 30: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Adaptive Thermal-Aware Task Assignment Policy for 3D MPSoCsAssignment Policy for 3D MPSoCs

WPP tt 1Probability of receiving workload at time t:

preferredavginitinc TTW if1

Weight:Cool core

preferredavgiinitdec

preferredavgi

initinc

TTWW

if

Hot coreFor each core

TTWEmpirical avgpreferredinit TTW

Measured by sensors

pconstants

© ESL/EPFL 2010

E.g., 80oCMeasured by sensors

30

Page 31: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Experiments 3D Thermal Management: 3D MPSoCs with Microchannels

Target 3D systems based on 3D version Sun UltraSPARC T1 P l d kl d f l t d i S

3D MPSoCs with Microchannels

• Power values and workloads from real traces measured in Sun platforms (multimedia players, web servers, databases, etc.)

Cores and caches in separate layersCores and caches in separate layers

Channels:Width 400umWidth 400um, Depth 250um. Four flow rate

© ESL/EPFL 2010

settings, default at 15ml/min.

31(EXP1-2) (EXP3) (EXP4)

Page 32: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Thermal Management for 3D Chips: Active-Adapt3D ComparisonsActive Adapt3D Comparisons

Predictive task scheduling, active cooling and floorplan-Predictive task scheduling, active cooling and floorplanaware DVFS achieves less than 5% hotspots

© ESL/EPFL 2010

Promising figures for thermal control in 3D-MPSoCs32[Coskun and Atienza, DATE ‘10]

Page 33: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Thermal Management in 3D Chips: Active-Adapt3D ComparisonsActive Adapt3D Comparisons

Variable multi-tier flow control useful for 3D systems with 3+ layersVariable multi tier flow control useful for 3D systems with 3+ layers. Proactive thermal management achieves:

• 75% reduction in spatial gradients on average -- for fixed flow ratep g g• 97% reduction in spatial gradients on average -- for variable flow rate

*LC: Multi-tier variable liquid

cooling

© ESL/EPFL 2010 33

Cooling power savings up to 67% to worst-case flux [Coskun and Atienza, DATE ‘10]

Page 34: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Conclusions

Complexity of coming 3D MPSoC chips requires novelthermal modeling approachesthermal modeling approaches• Application of simple RC-based methods demonstrated,

validated with 3D test chip• Initial model of liquid cooling channels in 3D chips

Simple RC laminar flow model, works well with variable liquidfluxes (errors of less than 2%)fluxes (errors of less than 2%)

Integrated the compact model into custom HotSpot tool

New thermal management: feedback controller adjusts flow New thermal management: feedback controller adjusts flow rate to allowed temperature with job assignment and DVFS• Proactive control improves the hot spot reduction to 95% forProactive control improves the hot spot reduction to 95% for

systems with variable flow rates, and reduces thermal variations• Dynamic flow rate adjustment is helpful in reducing the energy cost

f th d ll t (67% i )

© ESL/EPFL 2010

of the pump and overall system (67% power savings)

34

Page 35: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Key References and Bibliography

3D Thermal modeling and FPGA-based emulation• “3D-ICE: Compact transient thermal model for 3D ICs with liquid cooling via

enhanced heat transfer cavity geometries”, A. Sridhar, et al. Proc. of ICCAD 2010,USA, November 2010.

• “Transient Thermal Modeling of 2D/3D Systems-on-Chip with Active Cooling”,David Atienza, Proc. of THERMINIC 2009, Belgium, October, 2009.

Thermal management for 3D MPSoCs• “Fuzzy Control for Enforcing Energy Efficiency in High-Performance 3D

Systems”, M. Sabry, Ayse K. Coskun, David Atienza, Proc. of ICCAD 2010, USA,November 2010.

• “Energy-Efficient Variable-Flow Liquid Cooling in 3D Stacked Architectures”,Ayse K. Coskun, David Atienza, et al., Proc. of DATE 2010, Germany, March 2010.

• “Modeling and Dynamic Management of 3D Multicore Systems with LiquidC C f S S C OCooling”, Ayse K. Coskun, et al., Proc. of VLSI-SoC 2009, Brazil, October 2009.(Best Paper Award)

• “Dynamic Thermal Management in 3D Multicore Architectures”, Ayse K. Coskun,f

© ESL/EPFL 2010

et al., Proc. of DATE 2009, France, April 2009.

Page 36: Thermal Modeling and ManagementThermal Modeling and ... · Thermal-Reliability Issues in 3D Chips Latest chippp ys increase power density Non-uniform hot-spots in 2D chips [Sun, 1.8

Nano-Tera.ch Swiss Engineering Programme

European

QUESTIONS ?

pCommission

QUESTIONS ? Swiss National Science Foundation

© ESL/EPFL 2010