What computer architects need to know about memory throttling WEED 2010 June 20, 2010

© 2010 IBM Corporation What computer architects need to know about memory throttling WEED 2010 June 20, 2010 IBM Research – Austin Heather Hanson Karthick Rajamani


© 2010 IBM Corporation

What computer architects need to know about memory throttling

WEED 2010, June 20, 2010

IBM Research – Austin

Heather Hanson

Karthick Rajamani


Outline

Memory throttling overview

Experimental platform
– System configuration
– Memory throttling implementation

Memory throttling characterization
– Bandwidth
– Power
– Performance

Summary


Memory throttling in a nutshell

Memory throttling is a power-performance knob that:
– impacts memory reference rates of both instruction and data streams
– controls power
– can be used for safety or optimization
• regulate DIMM temperatures
• enforce memory power budgets

Memory throttling restricts read & write traffic
– directly controls memory power
– indirectly affects processors and other components

Several implementation styles in commercial systems
– insert periodic idle cycles
– allow an arbitrary number of transactions up to an estimated power threshold
– run + hold windows
– enforce read & write quotas [this paper]
• first N transactions proceed in each time window
• any further requests wait until the next time period


Comparison to clock throttling

Run-hold clock throttling: regular frequency during the run portion; clock halted during the hold portion.

Quota-style memory throttling: reads & writes proceed as requested, up to N requests per period.

Example: N = 6. Up to 6 transactions are serviced per period, regardless of request timing.

[Figure annotation: Nth request in each period; additional requests would be queued for later service.]
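The quota rule in the N = 6 example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the POWER6 controller logic: the function names, the cycle-granular timing, and the run-hold gate applied to the same request stream (included only for contrast) are all hypothetical.

```python
def quota_service_times(arrivals, n_quota, period):
    """Service cycle of each request under quota-style throttling.

    arrivals: sorted arrival cycles; n_quota: requests allowed per window (N);
    period: window length in cycles. All names here are illustrative.
    """
    service = []
    window = 0  # index of the window currently being filled
    used = 0    # requests already serviced in that window
    for t in arrivals:
        arrival_window = t // period
        if arrival_window > window:
            window, used = arrival_window, 0   # new window, quota resets
        if used == n_quota:
            window, used = window + 1, 0       # quota spent, defer to next window
        # Serviced on arrival, or at the start of the window it was deferred to.
        service.append(max(t, window * period))
        used += 1
    return service


def run_hold_service_times(arrivals, run, period):
    """Contrast case: events proceed only in the first `run` cycles of a period."""
    return [t if t % period < run else (t // period + 1) * period
            for t in arrivals]


late_burst = [4, 5, 6, 7, 8, 9]  # six requests arriving late in one window
print(quota_service_times(late_burst, 6, 10))     # [4, 5, 6, 7, 8, 9]
print(run_hold_service_times(late_burst, 6, 10))  # [4, 5, 10, 10, 10, 10]
```

With the same budget of 6 per 10-cycle period, the quota passes the late burst untouched, while the run-hold gate delays everything that lands in the hold portion. That is the "regardless of request timing" property of the quota style.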


POWER6 Memory Throttling

IBM JS12 blade system
– Processor
• POWER6
• 1 socket x 2 cores per processor socket
• 3.8 GHz frequency (fixed in these experiments)
• SLES10 Linux
– Memory
• 16 GB capacity
• 8 DIMMs x 2 GB each
• DDR2
• 667 MHz bus

Quota-style memory throttling
– N transactions per M memory cycles
• 100% throttle level == unthrottled
– Time period is shorter than thermal and power supply timescales
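To make the N-per-M quota concrete, the sketch below converts a quota into a window length and a bandwidth ceiling. The slides do not give the actual N, M, transaction size, or how the percentage is derived, so every number and name here is an assumption chosen only for illustration, including the reading of "throttle level" as the quota relative to the unthrottled quota.

```python
def throttle_window(n_allowed, m_cycles, cycle_hz, bytes_per_transaction):
    """Translate an N-per-M quota into (window length in s, bandwidth ceiling in B/s)."""
    window_seconds = m_cycles / cycle_hz
    max_bytes_per_second = n_allowed * bytes_per_transaction / window_seconds
    return window_seconds, max_bytes_per_second


def throttle_level_percent(n_allowed, n_unthrottled):
    """Quota relative to the unthrottled quota (assumed meaning of '100% == unthrottled')."""
    return 100.0 * n_allowed / n_unthrottled


# Hypothetical values: 10 transactions per 1000 cycles of a 333 MHz memory
# clock, 128 bytes per transaction. None of these come from the slides.
window_s, cap_bps = throttle_window(10, 1000, 333e6, 128)
print(f"window = {window_s * 1e6:.2f} us, ceiling = {cap_bps / 1e6:.1f} MB/s")
```

Under these assumed values the window lasts a few microseconds, which shows why a quota window can sit well below thermal and power-supply timescales.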


Memory throttle characterization methodology

1. Sweep throttle settings

• Set throttle

• Run steady-behavior benchmark
  – DAXPY (double-precision A * X plus Y)
  – FPMAC (floating-point multiply accumulate)
  – RandomMemory (generate random addresses)
  – SPECPower_ssj2008 calibration phase (peak throughput for warehouse transactions)

• Record sensor data, 256 ms per sample
  – Memory power
  – Memory reads & writes
  – Instruction throughput
  – And other sensors not shown here

• Decrement throttle

• Repeat for full range of throttle settings

2. Repeat throttle sweep for multiple benchmarks and memory footprints
– Microbenchmarks: L1 cache contained and main memory footprints
– SPECPower_ssj2008: behaves as nearly contained in on-chip caches

3. Calculate median sensor data for each permutation {benchmark, footprint, throttle}
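The sweep procedure above can be sketched as a small loop. This is a hedged outline only: `set_throttle` and `read_sensors` are hypothetical hooks standing in for whatever platform interface is available (the slides do not name one), and the benchmark is assumed to be running in steady state while samples are taken.

```python
from statistics import median


def sweep_throttle(set_throttle, read_sensors, levels, samples_per_level):
    """Step through throttle levels and keep the median of each sensor.

    set_throttle(level): hypothetical hook that applies a throttle setting.
    read_sensors():      hypothetical hook returning one sample as a dict,
                         e.g. {"mem_power_w": ..., "reads": ..., "writes": ...}.
    """
    medians = {}
    for level in levels:  # e.g. 100, 90, ..., 10 (decrementing sweep)
        set_throttle(level)
        samples = [read_sensors() for _ in range(samples_per_level)]
        # Median per sensor damps outliers from transient behavior.
        medians[level] = {
            name: median(sample[name] for sample in samples)
            for name in samples[0]
        }
    return medians
```

Running this once per {benchmark, footprint} pair yields the table of medians that step 3 describes.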


Memory throttle effect on bandwidth

[Figure: memory throttle effect on bandwidth. Annotated regions: linear; transition between linear & saturated regions; saturated.]
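An idealized way to read the linear and saturated regions is that delivered bandwidth is the smaller of the throttle ceiling and the workload's own demand. The sketch below encodes only that simplification. The function names and numbers are hypothetical, and the sharp knee it produces does not capture the gradual transition region seen in the measurements.

```python
def delivered_bandwidth(throttle_pct, peak_bw, demand_bw):
    """Idealized model: bandwidth is limited by the throttle or by demand."""
    ceiling = peak_bw * throttle_pct / 100.0
    return min(ceiling, demand_bw)


def region(throttle_pct, peak_bw, demand_bw):
    """'linear' while the throttle is the limit, 'saturated' once demand is."""
    ceiling = peak_bw * throttle_pct / 100.0
    return "linear" if ceiling < demand_bw else "saturated"


# Hypothetical workload demanding 4 GB/s on a 10 GB/s memory system.
for pct in (10, 20, 40, 80):
    print(pct, delivered_bandwidth(pct, 10.0, 4.0), region(pct, 10.0, 4.0))
```

In this model a low-demand workload saturates at a low throttle level, while a high-demand workload stays in the linear region across most of the throttle range.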


A closer look at RandomMemory-DIMM

• uses less bandwidth than other benchmarks at the same throttle levels
• also uses less bandwidth than its own saturation level

Simply measuring bandwidth at the current throttle level is not enough to identify the region of operation: a bandwidth below the maximum could indicate either the saturated region or the transition region.

As a result, a controller will not be able to accurately predict the effect of a throttle level change on bandwidth, or on power or performance.

Subtle but very important point about the transition region

Actual bandwidth < max bandwidth: bandwidth restrictions lead to pipeline starvation, which in turn leads to a reduced request rate.


Memory Power

Memory power is basically linear with bandwidth, so this chart resembles the bandwidth chart.


Throttling effects relative to each benchmark

[Figure panels: power; performance]

L1-contained DAXPY: throttling has no effect

DIMM-sized DAXPY: drastic effect

Generally more performance reduction than power reduction (in %)
– Throttling alone doesn’t affect the static portion of memory power
• Leveraging idle low-power modes of memory can improve the power-performance curve for memory request rate throttling.
– Possible to waste energy from longer execution time

Larger bandwidth demand means a larger effect from throttling
– Conversely, power is reduced only when performance is impacted.
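A short worked example shows why the percentage loss in performance tends to exceed the percentage saving in power, and how energy can rise. It assumes a fully bandwidth-bound workload (performance proportional to bandwidth) and memory power that is a static term plus a term linear in bandwidth. The wattages are made-up illustrative values, not measurements from the study.

```python
def throttling_outcome(static_w, dynamic_w_unthrottled, bandwidth_fraction):
    """Compare a throttled run with the unthrottled run under a linear power model.

    static_w:               power that throttling does not change (assumed).
    dynamic_w_unthrottled:  bandwidth-dependent power at full bandwidth (assumed).
    bandwidth_fraction:     delivered bandwidth relative to unthrottled, in (0, 1].
    """
    base_power = static_w + dynamic_w_unthrottled
    power = static_w + dynamic_w_unthrottled * bandwidth_fraction
    runtime_ratio = 1.0 / bandwidth_fraction  # bandwidth-bound: time scales as 1/BW
    return {
        "power_reduction_pct": 100.0 * (1.0 - power / base_power),
        "performance_reduction_pct": 100.0 * (1.0 - bandwidth_fraction),
        "energy_ratio": (power * runtime_ratio) / base_power,
    }


# Illustrative numbers only: 10 W static, 20 W dynamic, bandwidth halved.
print(throttling_outcome(10.0, 20.0, 0.5))
```

With these assumed values, halving bandwidth halves performance but trims power by only about a third, so the same work costs about one third more memory energy. A workload that fits in the L1 cache sits outside this model: throttling leaves its performance and its memory power essentially unchanged.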


Summary

Memory throttling is a power-performance knob available in commercial systems

Memory controller restricts read & write bandwidth
– caps memory power
– controls DIMM temperature

Mileage may vary
– power and performance management depend on bandwidth demand
• throttling a low-bandwidth workload doesn’t reduce much power
– potential to use more energy due to increased execution time
• use highly throttled settings with caution

Effective tool for power capping
– power constrained configurations
– thermal safety
– power shifting


Acknowledgements

IBM Research – Austin

IBM Systems & Technology Group
– Memory characterization: Joab Henderson, Kenneth Wright
– EnergyScale firmware: Guillermo Silva, Andrew Geissler