VRCon: Dynamic Reconfiguration of Voltage Regulators in a Multicore Platform Woojoo Lee, Yanzhi Wang, and Massoud Pedram University of Southern California


Page 1: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

VRCon: Dynamic Reconfiguration of Voltage Regulators in a Multicore Platform

Woojoo Lee, Yanzhi Wang, and Massoud Pedram
University of Southern California

Page 2: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

Outline

• Introduction
• Preliminary: VR characteristics
• Dynamic reconfiguration of the VR-to-core network
  – Proposed multicore platform
  – Reactive VRCon
  – Proactive VRCon
• Experimental work
• Conclusion

Mar-27-14 2

Page 3: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

Introduction (1/3)

• Per-chip DVFS vs. per-core DVFS
  – (Conventional) per-chip DVFS hinders DVFS from achieving its full potential.
  – Per-core DVFS allows excellent flexibility in controlling power, but its indispensable use of multiple voltage regulators (VRs) brings shortcomings in footprint, power conversion loss, and control complexity.
• We target multicore platforms that support per-core DVFS.

Page 4: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

Introduction (2/3)

• We focus on the power conversion efficiency of the multiple VRs.
  – The figure below shows traces of the VR efficiency while delivering power to a core. Around 24% of the input power is dissipated by a single VR in the high-efficiency region, but more than 53% of the input power is consumed by the VR in the low-efficiency region.
  – Summed over all VRs, these power dissipations can amount to a considerable power loss.

[Figure: VR efficiency (%) versus time (ms) while delivering power to a core, shown as two traces. Mean efficiency: 75.18% (high-efficiency region) and 46.38% (low-efficiency region).]
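The loss figures quoted on this slide follow directly from the two mean efficiencies in the figure. A minimal sketch of that arithmetic, assuming the only relation used is "dissipated share = 100% minus efficiency" (the function name `loss_fraction` is ours, not from the slides):

```python
def loss_fraction(efficiency_percent: float) -> float:
    """Share of the VR input power (in %) that is dissipated inside the VR."""
    return 100.0 - efficiency_percent


# Mean efficiencies read from the figure on this slide.
high_region_loss = loss_fraction(75.18)
low_region_loss = loss_fraction(46.38)

print(f"{high_region_loss:.2f}% {low_region_loss:.2f}%")  # prints 24.82% 53.62%
```

These values match the "around 24%" and "more than 53%" stated in the bullet.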

Page 5: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

Introduction (3/3)

• We propose a system-level optimization technique to substantially improve the VR efficiency: VR consolidation (VRCon for short).
  – The technique starts from the intuition that cores requiring the same voltage level and drawing small load currents can be combined and powered by a single VR.
  – Why is this helpful? The reasons follow from the VR characteristics, covered in the following slides.
  – We present two VRCon techniques: a reactive VRCon and a proactive VRCon.

Page 6: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

VR characteristics (1/3)

• We target inductive switching regulators.
  – Inductive switching regulators achieve higher conversion efficiencies over a wide range of output loads than other types of VRs, such as low-dropout regulators and switched-capacitor regulators.
  – Because it is equipped with a controller that supports dynamic voltage setting with fast transient response, the inductive switching regulator is suitable for powering processors.
  – The circuit schematic is shown below:

[Figure: circuit schematic of the inductive switching regulator.]

Page 7: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

VR characteristics (2/3)

• The load current condition of the VR affects the VR efficiency.
  – The figure below shows efficiency versus load current, simulated with the VR schematic and the 45nm PTM (Predictive Technology Model). The main sources of power loss are the switching and controller losses in Region I and the conduction loss in Region II.
  – Modern VRs exhibit high peak efficiency at a specific load current value, but their efficiency drops dramatically under adverse load current conditions.
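The peaked efficiency curve can be reproduced with a very small loss model. The sketch below is an assumption for illustration, not the simulated VR from the slides: it lumps the switching and controller losses into a fixed term `p_fixed` and models the conduction loss as `r_cond * I^2`. All parameter values are hypothetical.

```python
import math


def vr_loss(i_load, p_fixed=0.5, r_cond=0.02):
    """Hypothetical VR loss (W): fixed switching/controller loss plus conduction loss."""
    return p_fixed + r_cond * i_load ** 2


def vr_efficiency(i_load, v_out=1.0, p_fixed=0.5, r_cond=0.02):
    """Conversion efficiency (0..1) of the hypothetical VR at a given load current (A)."""
    p_out = v_out * i_load
    return p_out / (p_out + vr_loss(i_load, p_fixed, r_cond))


def peak_current(p_fixed=0.5, r_cond=0.02):
    """Load current that maximizes efficiency: where fixed loss equals conduction loss."""
    return math.sqrt(p_fixed / r_cond)


for current in (0.5, peak_current(), 15.0):
    print(f"I = {current:5.2f} A  efficiency = {vr_efficiency(current):.3f}")
```

At light load the fixed term dominates and at heavy load the conduction term dominates, which is the qualitative behavior the slide attributes to the two regions.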

Page 8: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

VR characteristics (3/3)

• VRCon is motivated by saving power: it configures the VR-to-core network to use a single VR instead of multiple VRs whenever possible.
  – If some cores in a multicore processor require the same voltage level and have small load currents, their power domains can be consolidated to share a single VR.
  – The VR that powers multiple cores then sees a relatively high load current and hence operates at higher efficiency.
  – The VRs that are not used can be turned off to save power.
• Now, let's go into the details of VRCon!
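A rough numerical illustration of why consolidation helps only at small load currents. It assumes a hypothetical VR loss model (a fixed loss plus a term proportional to the square of the load current) and an optional switch loss `p_switch`; none of these values come from the slides.

```python
def vr_loss(i_load, p_fixed=0.5, r_cond=0.02):
    """Hypothetical loss (W) of one active VR; a VR that is turned off loses nothing."""
    if i_load <= 0:
        return 0.0
    return p_fixed + r_cond * i_load ** 2


def consolidation_saving(currents, p_switch=0.0):
    """Loss with one VR per core minus loss with a single shared VR (W)."""
    separate = sum(vr_loss(i) for i in currents)
    shared = vr_loss(sum(currents)) + p_switch
    return separate - shared


light = consolidation_saving([1.0, 1.0])   # two lightly loaded cores
heavy = consolidation_saving([6.0, 6.0])   # two heavily loaded cores
print(f"light load saving: {light:+.2f} W, heavy load saving: {heavy:+.2f} W")
```

Under this model, sharing one VR removes one fixed loss but increases the conduction loss, so the saving is positive for the light-load pair and negative for the heavy-load pair.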

Page 9: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

Proposed platform (1/2)

• The proposed platform has several components:
  – The power manager (PM) monitors the core status (i.e., performance) reported by the hardware performance monitor (HPM). Unlike PMs in conventional multicore platforms, the PM here determines tentative voltage and frequency levels for the cores and transmits this information to the VRCon manager.
  – Network switches implement the reconfigurable VR-to-core network.
  – The VRCon manager (VRCM) is added to ultimately control the cores' frequency/voltage levels, as well as the operation of the VRs and the ON/OFF states of the network switches in VRCon.

Page 10: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

Proposed platform (2/2)

• The figure below is a conceptual diagram of the proposed multicore platform.

[Figure: conceptual diagram of the proposed platform. Labeled blocks: multi-core processor with per-core DVFS (Core 1 to Core 4, Core 5 to Core 8, Core 9 to Core 12, ...), VR groups (VR 1 to VR 4, VR 5 to VR 8, VR 9, ...), VR-to-core distribution network with switch sets 1 to 3, hardware performance monitor, sensing circuits, power manager, and VRCon manager. Labeled signals: DVFS opinion, DVFS setup, VR output setup, and dynamic config.]

Page 11: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

VRCon: overview (1/7)

• The power saving achieved by employing DVFS strongly depends on the frequency of the decision-making process.
  – Equivalently, it depends on the duration of the decision period ($T_{DVFS}$).
  – $T_{DVFS}$ should be considered a design variable to be set by the PM, and it needs to be (much) longer than the voltage scaling time of the VR.
• Because reconfiguration only turns the network switches on or off, the time to reconfigure the VR-to-core network ($T_{NS}$) is limited only by the transient response of the VR.
  – It is in general much shorter than the voltage scaling time.
• We treat the DVFS setting and the network reconfiguration as the global and local power management of VRCon.
  – $T_{DVFS}$ and $T_{NS}$ are the required minimum global and local decision epoch lengths, respectively.

Page 12: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

VRCon: reactive VRCon (2/7)

• As a local management function, the reactive VRCon applies only to cores with the same voltage level.
  – The figure below shows an example of applying the reactive VRCon to a dual-core platform.

[Figure: Vdd (V) and load current (A) of the two cores over time; the voltage axis marks the levels 0.75, 0.83, 0.95, 1.05, and 1.2 V. The figure marks regions that are valid for VRCon and regions that are not valid because of the high load current.]

Page 13: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

VRCon: reactive VRCon (3/7)

• (cont.)
  – The VRCM in this case performs only the network switch control to minimize the total energy consumption.
  – The total energy consumption is the sum of the energy losses of the active VRs (including the network switches) and the energy consumption of the cores during the time period $T_{DVFS}$.
• Algorithm for reactive VRCon
  – The VRCM first sorts the cores that have the same voltage level and a load current lower than the maximum driving capability of a single VR.
  – The VRCM finds the two cores whose merger maximizes the VR energy saving. The merged cores are then treated as one core.
  – The VRCM repeats the above procedure until no core is available for merging.
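The greedy merging steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the VR loss model, the drive limit `I_MAX`, the switch loss `p_switch`, and the rule "stop when no feasible merge yields a positive saving" are all hypothetical choices made here.

```python
from itertools import combinations

I_MAX = 10.0  # hypothetical maximum drive current of a single VR (A)


def vr_loss(i_load, p_fixed=0.5, r_cond=0.02):
    """Hypothetical loss (W) of one active VR."""
    return p_fixed + r_cond * i_load ** 2


def reactive_vrcon(cores, p_switch=0.01):
    """Greedy pairwise consolidation.

    cores: dict mapping core name -> (voltage level in V, load current in A).
    Returns a list of groups (voltage, total current, [core names]); each
    group is powered by one VR.
    """
    groups = [(v, i, [name]) for name, (v, i) in cores.items()]
    while True:
        best = None
        for a, b in combinations(range(len(groups)), 2):
            (va, ia, _), (vb, ib, _) = groups[a], groups[b]
            # Only groups at the same voltage level that one VR can drive.
            if va != vb or ia + ib > I_MAX:
                continue
            saving = vr_loss(ia) + vr_loss(ib) - vr_loss(ia + ib) - p_switch
            if saving > 0 and (best is None or saving > best[0]):
                best = (saving, a, b)
        if best is None:
            return groups
        _, a, b = best
        merged = (groups[a][0],
                  groups[a][1] + groups[b][1],
                  groups[a][2] + groups[b][2])
        # The merged pair is treated as one equivalent core from now on.
        groups = [g for k, g in enumerate(groups) if k not in (a, b)] + [merged]


example = {"c1": (0.75, 1.0), "c2": (0.75, 2.0), "c3": (0.75, 3.0), "c4": (1.2, 6.0)}
result = reactive_vrcon(example)
for voltage, current, names in result:
    print(voltage, current, sorted(names))
```

With these hypothetical numbers, the three 0.75 V cores end up on one VR and the 1.2 V core keeps its own VR.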

Page 14: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

VRCon: proactive VRCon (4/7)

• For its global power management function, the proactive VRCon exploits the DVFS technique to scale the frequency (and its corresponding voltage level).
  – The proactive VRCon takes into account the energy consumption of both the cores and the VRs.
  – There can be a trade-off between the energy saving by DVFS (which is initially determined by the PM) and the reduced energy loss from adaptively turning off VRs and using fewer VRs at higher conversion efficiencies.
  – If the VRCM determines that the latter option is better, it will not decrease the frequency/voltage levels of some cores to the minimum level possible; instead, it will adjust the frequency/voltage levels of the cores to increase the chances of applying VRCon.

Page 15: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

VRCon: proactive VRCon (5/7)

• The objective here is to find the frequency/voltage level of each core for each $T_{DVFS}$ to minimize the total energy consumption:

$$\min \left( \sum_{t=1}^{T} E_{T_{DVFS},t}\left(V_{core,1}, V_{core,2}, \ldots, V_{core,N}\right) \right)$$

  – $E_{T_{DVFS},t}$ denotes the total energy consumption during the $t$-th time period of $T_{DVFS}$; $V_{core,i}$ is the $i$-th core's voltage level.
  – $N$ is the total number of cores; the index $T$ in $E_{T_{DVFS},T}$ marks the period in which all task processing is finished.
• Solving the objective is difficult because:
  – Changing $V_{core,\forall i}$ in time period $T_{DVFS,t}$ affects the VRCon results in period $T_{DVFS,t+1}$.
  – There are locking and synchronization issues of multi-threaded applications in multicore processors.

Page 16: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

VRCon: proactive VRCon (6/7)

• Therefore, by exploiting the initial DVFS schedule of the PM, we first divide the overall problem into sub-problems, each of which only concerns how to modify the initial DVFS schedule to optimize the energy saving results of the reactive VRCon in a given period $T_{DVFS}$.
  – To guarantee that performance (i.e., the total execution time of applications) is not degraded by modifying the DVFS schedule, we impose the constraint that the VRCM can only keep or increase (but not decrease) the frequency/voltage level of each core relative to the original DVFS level suggested by the PM.
  – This can be formulated as follows:

$$f\left(V^{new}_{core,1}, V^{new}_{core,2}, \ldots, V^{new}_{core,N}\right) < f\left(V^{others}_{core,1}, V^{others}_{core,2}, \ldots, V^{others}_{core,N}\right)$$

$$\text{s.t. } V^{new}_{core,i} \ge V^{PM}_{core,i}, \quad \text{for } 1 \le i \le N$$
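For a handful of cores, the sub-problem can be explored by exhaustive search, which makes the constraint concrete. The sketch below is not the clustering heuristic of the slides. It assumes that f is the total power of the cores plus the VR losses, uses a hypothetical cubic core power model and a hypothetical VR loss model, and simplifies consolidation to "all cores at one level share a VR when that is cheaper and within `I_MAX`". Only the five voltage levels are taken from the slides.

```python
from itertools import product

LEVELS = (0.75, 0.83, 0.95, 1.05, 1.2)  # voltage levels shown on the slides (V)
I_MAX = 10.0                             # hypothetical drive limit of one VR (A)


def vr_loss(i_load, p_fixed=0.5, r_cond=0.02):
    """Hypothetical loss (W) of one active VR."""
    return p_fixed + r_cond * i_load ** 2


def core_power(voltage, activity):
    """Hypothetical core power (W), assuming frequency scales with voltage."""
    return activity * voltage ** 3


def total_power(voltages, activities):
    """Assumed objective f: core power plus VR losses after simplified consolidation."""
    powers = [core_power(v, a) for v, a in zip(voltages, activities)]
    total = sum(powers)
    for level in set(voltages):
        currents = [p / v for p, v in zip(powers, voltages) if v == level]
        separate = sum(vr_loss(i) for i in currents)
        shared = vr_loss(sum(currents)) if sum(currents) <= I_MAX else float("inf")
        total += min(separate, shared)
    return total


def best_levels(v_pm, activities):
    """Search all assignments with V_new >= V_PM for every core; return the cheapest."""
    choices = [[v for v in LEVELS if v >= floor] for floor in v_pm]
    return min(product(*choices), key=lambda vs: total_power(vs, activities))


v_pm = (0.75, 0.83)          # levels suggested by the PM (hypothetical example)
activities = (1.0, 1.0)
v_new = best_levels(v_pm, activities)
print(v_new, total_power(v_pm, activities), total_power(v_new, activities))
```

In this example, raising the first core from 0.75 V to 0.83 V costs some core power but lets both cores share one VR, so the total is lower than with the PM's original levels.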

Page 17: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

VRCon: proactive VRCon (7/7)

• We present a clustering-based heuristic solution as follows:
  - We first sift through the cores driving a small amount of current so that they can be combined with others.
  - Next, we consolidate two cores (and treat them as one equivalent core) if this merge results in the maximum energy saving.
  - The procedure is repeated until no energy saving can be achieved by VR consolidation.

Page 18: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

Experimental work (1/4)

• Multicore processor setup
  – We performed the multicore processor simulations in the Sniper simulator. The platform configurations were based on the Intel Xeon Nehalem architecture; the topology is shown in the figure below.
  – We set the five DVFS levels as follows:
    [Table: the five DVFS levels.]
  – We modified the code related to the McPAT module in Sniper to collect the power and timing data under per-core DVFS.
  – Multi-threaded applications from the PARSEC and SPLASH-2 benchmarks were used in the simulation.

[Figure: simulated topology, drawn for Core 1 to Core 4 and Core 12 to Core 15. Each core has a 32KB L1-I cache, a 32KB L1-D cache, and a 256KB L2 cache; each group of four cores is drawn with one 8MB L3 cache and DRAM.]

Page 19: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

Experimental work (2/4)

• Per-core DVFS simulation
  – We treat the PM's DVFS recommendation as given a priori and use an offline DVFS approach as an intermediate step toward the overall aim.
  – We adopt an ILP-based algorithm, as follows:

$$\min \left( \sum_{r}^{R} \sum_{s}^{S} P_{r,s}\, x_{r,s} \right)$$

$$\text{s.t. } \sum_{r}^{R} \sum_{s}^{S} D_{r,s}\, x_{r,s} < \delta, \quad \text{and} \quad \sum_{r}^{R} \sum_{s}^{S} x_{r,s} = R$$

    • $R$ is the total number of intervals, and $S$ is the number of frequency/voltage levels (five). $P_{r,s}$ is the power consumption at the $s$-th frequency/voltage level in the $r$-th interval. Following the same notation, $D_{r,s}$ denotes the delay incurred under that frequency/voltage condition.
    • $\delta$ is a certain performance penalty.
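For small instances, the selection problem above can be solved by enumerating every level assignment, which helps to check an ILP solver's answer. The sketch below makes two assumptions that the slide does not state explicitly: the decision variable is binary, and exactly one level is chosen per interval (one way to satisfy the constraint that the variables sum to R). The function name and the example tables are hypothetical.

```python
from itertools import product


def offline_dvfs(P, D, penalty_bound):
    """Exhaustive search for the level of each interval.

    P[r][s], D[r][s]: power and delay of interval r at level s.
    Returns (minimum total power, tuple of chosen level indices), or None
    if no assignment keeps the total delay strictly below penalty_bound.
    """
    num_intervals, num_levels = len(P), len(P[0])
    best = None
    for choice in product(range(num_levels), repeat=num_intervals):
        delay = sum(D[r][s] for r, s in enumerate(choice))
        if delay >= penalty_bound:  # the constraint is a strict inequality
            continue
        power = sum(P[r][s] for r, s in enumerate(choice))
        if best is None or power < best[0]:
            best = (power, choice)
    return best


# Hypothetical data: 2 intervals, 3 levels (level 0 is slowest and cheapest).
P = [[1, 2, 4], [1, 3, 5]]
D = [[3, 1, 0], [4, 2, 0]]
solution = offline_dvfs(P, D, penalty_bound=4)
print(solution)  # prints (5, (1, 1))
```

The search grows exponentially with the number of intervals, so it only serves as a reference for tiny cases.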

Page 20: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

Experimental work (3/4)

• VR-to-core network setup
  – We selected the LTC3816 programmable VR, which can power each core in our processor setup and achieves high efficiency at the average core current level obtained from the benchmark simulations.
  – We set the number of VRs and cores in one group of the VR-to-core network to 4.
  – We determined the width of the network switch to be 8mm based on 45nm technology.
  – We performed LTspice simulations to acquire the VR efficiencies for various load currents under the five output voltage levels.

[Figure: VR efficiency (%) and power loss (W) versus load current (A), from 0 to 18 A, at an input voltage of 12V, with one curve per output voltage: 1.2V, 1.05V, 0.95V, 0.83V, and 0.75V.]

Page 21: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

Experimental work (4/4)

• Simulation results
  – We define $G_{VR}$ and $G_{total}$ as the energy loss reduction from the VRs and the total energy saving, respectively.
  – When we ran Streamcluster in the 8-core simulator setup, the reactive VRCon achieved $G_{VR}$ = 24.06% and $G_{total}$ = 9.96%, and the reactive and proactive VRCon together achieved $G_{VR}$ = 35.86% and $G_{total}$ = 14.85%.
  – The results for various applications under the different simulator setups are shown below.
    • (I), (II), and (III) indicate the 16-core, 8-core, and 4-core setups, respectively.
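One way to relate the two Streamcluster figures: if only the VR losses changed and the core energy stayed the same, the total saving would equal the VR loss reduction scaled by the share of VR loss in the baseline total energy. The sketch below computes that implied share. This relation is our assumption, not a statement from the slides, and it is only approximate for the proactive case, where core voltage levels may be raised.

```python
def implied_vr_loss_share(g_vr_percent, g_total_percent):
    """VR loss as a fraction of baseline total energy, assuming core energy is unchanged."""
    return g_total_percent / g_vr_percent


reactive_share = implied_vr_loss_share(24.06, 9.96)
combined_share = implied_vr_loss_share(35.86, 14.85)
print(f"{reactive_share:.3f} {combined_share:.3f}")  # prints 0.414 0.414
```

Both pairs of numbers imply that the VR losses are about 41% of the baseline energy for this benchmark under that assumption.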

Page 22: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

Conclusion

• We addressed the problem of power conversion efficiency in the multicore platform.
  – Significant power is dissipated by the multiple VRs needed to support per-core DVFS.
• We proposed VR consolidation methods with a configurable VR-to-core distribution network.
  – The reactive VRCon configures the network to enhance the power conversion efficiency under predetermined DVFS levels.
  – The proactive VRCon determines new DVFS levels to maximize the system-wide energy saving without performance degradation.

Page 23: VRCon: Dynamic Reconfiguration of Voltage Regulators in a ...

Q&A

• Thank you!