How Can EDA Help Solve Challenges in Data Center Energy Efficiency? Ayse K. Coskun, Electrical and Computer Engineering Department, Boston University. http://people.bu.edu/acoskun http://www.bu.edu/peaclab/ June 5, 2016


How Can EDA Help Solve Challenges in Data Center Energy Efficiency?

Ayse K. Coskun

Electrical and Computer Engineering Department

Boston University

http://people.bu.edu/acoskun

http://www.bu.edu/peaclab/

June 5, 2016

Energy Efficiency of Computing

Emerging applications in big data, cyber-physical systems, Internet of Things, cloud, etc.:

• Growing performance/Watt demand

Source: J. Koomey (Stanford, LBNL), 2011

Energy Efficiency of Computing

• Energy-related costs are among the largest contributors to the total cost of ownership in data centers

Source: International Data Corporation (IDC)

• Data centers consume ~3-4% of US electricity (2011)

• IT is estimated to be responsible for 10% of world energy use (2013)

• Cutting 40% of server room energy waste could save businesses $3B annually (2013)

Power Grid & Market

• Power supply must match demand at all times (a mismatch can lead to blackouts)

• Renewable energy sources: intermittent

• Lack of reliable, large-scale, economical energy storage solutions

• Independent System Operator (ISO):
  • Demand Response:
    • Peak Shaving, Capacity Reserves (new)
    • Credits provided to participants who modulate their power consumption dynamically

Demand Side: Data Centers

• Electricity: >3% of the overall consumption in the US [1]

• Power capping / management techniques
  • Enable flexibility in power consumption

• Workload flexibility

Benefits of Participation

• Help address the instability of renewable energy supply

• Provide additional reserves to accommodate other, less flexible uses of electricity

• Achieve significant monetary savings

Data centers offer a unique opportunity for providing power capacity reserves.

[1] J. Koomey. Growth in Data Center Electricity Use 2005 to 2010. Oakland, CA: Analytics Press, August 1, 2010.


Data Centers in the Smart Grid

Regulation Service (RS) Reserves

• Bidding: $(\bar{P}, R)$ → Price settling → Get contract ($\bar{P}$: average power, $R$: regulation reserves)

• ISO: RS signal $z(t)$

• Data Center Regulation: $P_{cap}(t) = \bar{P} + z(t)\,R$

• Error: $\varepsilon(t) = \dfrac{|P_{real}(t) - P_{cap}(t)|}{R}$

• $\varepsilon(t)$ needs to be small: $\varepsilon(t)$ > threshold => lose license

• Costs:
  • ΠE and ΠR: market clearing prices
  • Credits are reduced based on statistics of $\varepsilon(t)$

[Figure: typical PJM regulation signal trajectories with 150 s (F) and 300 s (S) ramp rates]

[Figure: credit earned]
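The relationship between the regulation signal, the power cap, and the tracking error can be sketched in a few lines. This is only an illustration: the numbers are hypothetical, the signal z(t) is assumed to be normalized to [-1, 1], and the error is read as the absolute deviation from the cap divided by R.

```python
def power_cap(p_avg, reserve, z):
    """Target power P_cap(t) = P_avg + z(t) * R."""
    if not -1.0 <= z <= 1.0:
        # Assumption: the regulation signal is normalized to [-1, 1].
        raise ValueError("z(t) is assumed to lie in [-1, 1]")
    return p_avg + z * reserve


def tracking_error(p_real, p_cap, reserve):
    """Normalized tracking error eps(t) = |P_real(t) - P_cap(t)| / R."""
    return abs(p_real - p_cap) / reserve


# Hypothetical bid: 100 kW average power with 30 kW of reserves.
P_AVG, R = 100.0, 30.0
signal = [0.0, 0.5, -1.0, 1.0]          # z(t) samples sent by the ISO
measured = [101.0, 112.0, 73.0, 127.0]  # P_real(t) samples in kW

errors = [tracking_error(p, power_cap(P_AVG, R, z), R)
          for z, p in zip(signal, measured)]
mean_error = sum(errors) / len(errors)
print(f"mean tracking error: {mean_error:.3f}")
```

Statistics such as this mean are the kind of quantity that the credit reduction would be based on; the exact statistic used by the market is not stated on the slide.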

Data Center Model [ICCAD’13, ASPDAC’14, IGCC’14]

• Server States:
  • Active: Pserver = Pdyn + Pstatic
    • Pdyn can be modulated by DVFS or CPU resource limits
    • Pdyn = k * RIPS
  • Idle: Pserver = Pstatic
  • Sleep: Pserver = Psleep
    • Constant low power, but resuming from sleep has a time delay (tres) and an energy cost (Eloss)

• Servicing Model:
  • Job arrival (homogeneous jobs) → FIFO queue → allocation to Server 1, Server 2, ..., Server i, ..., Server N
  • Each server: 1 job at a time
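A minimal sketch of the three-state server power model above. The constants are hypothetical and chosen only to keep the arithmetic easy to follow; k is taken to be in watts per unit of RIPS.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    IDLE = "idle"
    SLEEP = "sleep"


def server_power(state, rips=0.0, k=0.0, p_static=0.0, p_sleep=0.0):
    """Power draw of one server under the three-state model."""
    if state is State.ACTIVE:
        return k * rips + p_static  # P_dyn + P_static, with P_dyn = k * RIPS
    if state is State.IDLE:
        return p_static
    return p_sleep


# Hypothetical server: 100 W static, 10 W asleep, k = 10 W per unit of RIPS.
for state in State:
    watts = server_power(state, rips=5.0, k=10.0, p_static=100.0, p_sleep=10.0)
    print(state.value, watts)
```

The resume delay (tres) and energy cost (Eloss) are not modeled here; they would matter in a simulation that tracks transitions over time.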

Dynamic Power Control Policy [ICCAD’13, ASPDAC’14, IGCC’14]

Server State Transition Rules [Gandhi IGCC12]:

• A server that has been idle for longer than ttout (timeout threshold) goes to sleep.

• When a new job arrives: select the server with the smallest current tidle(t) to activate.

• When servers must be forced to sleep: select the servers with the largest current tidle(t) to put to sleep.

tidle(t): the time that a server has been in the idle state at time t.
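The three transition rules can be sketched as small selection functions over a map from server id to its current idle time. The function names and the dictionary layout are illustrative choices, not taken from the cited work.

```python
def servers_to_sleep_on_timeout(idle_times, t_tout):
    """Rule 1: servers idle for longer than the timeout go to sleep."""
    return [sid for sid, t in idle_times.items() if t > t_tout]


def server_to_activate(idle_times):
    """Rule 2: a new job goes to the idle server with the smallest t_idle."""
    # Returning None when no server is idle is an assumption of this sketch.
    return min(idle_times, key=idle_times.get) if idle_times else None


def servers_to_force_sleep(idle_times, count):
    """Rule 3: force the `count` servers with the largest t_idle to sleep."""
    ranked = sorted(idle_times, key=idle_times.get, reverse=True)
    return ranked[:count]


# Hypothetical idle times in seconds for three idle servers.
idle = {"s1": 4.0, "s2": 12.0, "s3": 0.5}
print(servers_to_sleep_on_timeout(idle, t_tout=10.0))
print(server_to_activate(idle))
print(servers_to_force_sleep(idle, count=2))
```

Waking the most recently idled server and sleeping the longest-idle ones tends to concentrate work on a few servers, which lets the rest reach the timeout.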

QoS-feedback [ICCAD’13, ASPDAC’14, IGCC’14]

• Goal: Reduce energy consumption under QoS constraints

• Put all active servers at their maximal throughput (to reduce waste from idle power):
  $P_{server,j} = k_j \cdot RIPS_j + P_{static}$

• Determine the minimal number of servers required at each time t, based on:
  • the current length of the queue
  • the overall QoS performance until t
  $N_{min} = h(S(t) + F(t) + Q(t)) - S_{SLA}(t) - F_{SLA}(t)$

• SLA: $(d, h)$, $d = T_{real} / T_{min}$

• If additional servers are required: wake them up

• Otherwise: apply server state transition rules for spare servers

Regulation Service (RS) [ICCAD’13, ASPDAC’14, IGCC’14]

• Case 1: Preal(t) < Pcap(t)
  1. Active servers with Pserver < Pmax: Pserver -> Pmax;
  2. Existing waiting jobs and idle servers: activate idle servers at Pmax;
  3. Sleeping servers: resume using server state transition rules.
  Do the above three steps in order until Preal(t) = Pcap(t).

• Case 2: Preal(t) > Pcap(t)
  1. Active servers with Pserver < Pmax: Pserver -> Pmin;
  2. Active servers with Pserver = Pmax: Pserver -> Pmin;
  3. Idle servers: suspend using server state transition rules.
  Do the above three steps in order until Preal(t) = Pcap(t).

Pmin is set to guarantee QoS.
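The two cases can be sketched as a single routine that walks the steps in order and stops as soon as the gap to the cap is closed. This is a simplified illustration, not the published controller: the power levels are hypothetical, active-server power is treated as continuously adjustable, state changes are modeled as instantaneous jumps (so the cap can be overshot), and servers are resumed or suspended in list order rather than by the transition rules.

```python
from dataclasses import dataclass

# Hypothetical power levels in watts.
P_MAX, P_MIN, P_STATIC, P_SLEEP = 200.0, 140.0, 100.0, 10.0


@dataclass
class Server:
    state: str    # "active", "idle", or "sleep"
    power: float  # current draw in watts


def total_power(servers):
    return sum(s.power for s in servers)


def _shift(server, delta):
    """Change a server's power by delta and return delta."""
    server.power += delta
    return delta


def track_cap(servers, p_cap, waiting_jobs=0):
    """Apply the Case 1 / Case 2 steps in order; return leftover P_cap - P_real."""
    gap = p_cap - total_power(servers)
    if gap > 0:  # Case 1: P_real < P_cap, so raise power
        for s in servers:  # 1. push active servers toward P_MAX
            if s.state == "active" and gap > 0:
                gap -= _shift(s, min(max(0.0, P_MAX - s.power), gap))
        for s in servers:  # 2. idle servers pick up waiting jobs at P_MAX
            if s.state == "idle" and gap > 0 and waiting_jobs > 0:
                s.state, waiting_jobs = "active", waiting_jobs - 1
                gap -= _shift(s, P_MAX - s.power)
        for s in servers:  # 3. resume sleeping servers (they become idle)
            if s.state == "sleep" and gap > 0:
                s.state = "idle"
                gap -= _shift(s, P_STATIC - s.power)
    elif gap < 0:  # Case 2: P_real > P_cap, so lower power
        below = [s for s in servers if s.state == "active" and s.power < P_MAX]
        at_max = [s for s in servers if s.state == "active" and s.power >= P_MAX]
        for s in below + at_max:  # 1. and 2. slow active servers toward P_MIN
            if gap < 0:
                gap -= _shift(s, -min(max(0.0, s.power - P_MIN), -gap))
        for s in servers:  # 3. suspend idle servers
            if s.state == "idle" and gap < 0:
                s.state = "sleep"
                gap -= _shift(s, P_SLEEP - s.power)
    return gap


cluster = [Server("active", 150.0), Server("idle", 100.0), Server("sleep", 10.0)]
print(track_cap(cluster, p_cap=300.0), total_power(cluster))
```

A non-zero return value means the steps were exhausted, or a discrete state change overshot, before the cap was met exactly.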

Server provisioning for a cluster [ASPDAC’14, IGCC’14]

Goals:
• Minimize tracking error
• Reduce #transitions
• Reduce idle energy waste

[Figure: Distribution of Power Tracking Error. x-axis: Power Tracking Error; y-axis: Probability; series: single server, data center]

[Figure: Distribution of Servicing Time Degradation. x-axis: Servicing Time Degradation; y-axis: Probability; series: single server, data center]

Regulation Reserves (R) / Avg. Power ($\bar{P}$):
• Single Server: 29.7%
• 100-server Data Center: 56.8%

$\varepsilon(t) = \dfrac{|P_{real}(t) - P_{cap}(t)|}{R}$

$\min_{u(t) \in U(x(t))} J(x(t), u(t))$, where

$J(x(t), u(t)) = a_1 |P_{real}(t) - P_{cap}(t)| + a_2 N_{tran}(t) - a_3 N_{sleep}(t) - a_4 N_{peak}(t)$

The terms account for tracking error, transition energy waste, and static energy waste.
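To make the objective concrete, the sketch below evaluates J for a handful of candidate control actions and picks the minimizer. Everything numeric here is hypothetical, including the weights; N_tran, N_sleep, and N_peak are read as counts of state transitions, sleeping servers, and servers at peak power, which the slide does not define, and the deviation from the cap is taken in absolute value.

```python
def cost(p_real, p_cap, n_tran, n_sleep, n_peak, weights=(1.0, 0.5, 0.2, 0.2)):
    """J = a1*|P_real - P_cap| + a2*N_tran - a3*N_sleep - a4*N_peak."""
    a1, a2, a3, a4 = weights
    return a1 * abs(p_real - p_cap) + a2 * n_tran - a3 * n_sleep - a4 * n_peak


P_CAP = 1000.0  # hypothetical cap in watts

# Hypothetical outcomes of three candidate actions u(t).
candidates = {
    "keep":      dict(p_real=1040.0, n_tran=0, n_sleep=2, n_peak=5),
    "sleep_one": dict(p_real=1005.0, n_tran=1, n_sleep=3, n_peak=5),
    "slow_all":  dict(p_real=1000.0, n_tran=0, n_sleep=2, n_peak=0),
}

best = min(candidates, key=lambda u: cost(p_cap=P_CAP, **candidates[u]))
print(best)
```

Enumerating a fixed set of actions stands in for the constrained minimization over U(x(t)); a real controller would generate the feasible actions from the current state.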

Results [ICCAD’13, ASPDAC’14, IGCC’14]

Data Center Power Management

[Diagram labels: ISOs; regulation requests; load forecasting & bidding in the energy market; optimal control & allocation of power caps; performance, power & temperature models; workload arrivals; multi-level load queues; data center (racks, multi-core server; Pcomputing, Pcooling); sensor feedback; data center cooling control]

• Server power capping

• Efficient consolidation

• Cooling control

Power Capping on Multicore Systems

Large Scale Computing System (several racks) → Server (multicore processors) → Individual Server Power Cap

Individual Server Power Cap:
• Allocated by a budgeting policy
• Maintain target power consumption
• Typically performed with DVFS

Goal: Adaptively control individual server to maximize performance within power cap

Parallel Workloads Increasingly Prevalent

[Figure: Threads 1-4 mapped onto Cores 1-4; each core is shown as Active or in a Low-Power State]


Optimal DVFS + Thread Packing (Pack & Cap) [ICCAD’11, MICRO’11, IEEE Micro’12]

Benchmarks: blackscholes, canneal, fluidanimate, swaptions

[Figure: for each workload sample Φ(x1) ... Φ(x6), power and runtime under six DVFS/core-count settings; settings that do not fit the power cap are crossed out. Two power/runtime profiles appear in the figure:]

Setting             Profile 1       Profile 2
1.60 GHz, 1 core    105 W, 50 s     110 W, 50 s
2.00 GHz, 1 core    115 W, 45 s     120 W, 45 s
2.67 GHz, 1 core    125 W, 40 s     130 W, 40 s
1.60 GHz, 2 cores   110 W, 30 s     115 W, 30 s
2.00 GHz, 2 cores   120 W, 25 s     125 W, 25 s
2.67 GHz, 2 cores   130 W, 20 s     135 W, 20 s

Power Cap = 120 W

Optimal DVFS + Thread Packing (Pack & Cap) [ICCAD’11, MICRO’11, IEEE Micro’12]

[Figure: the per-sample power/runtime data for Φ(x1) ... Φ(x6) under the six DVFS/core-count settings, with Power Cap = 120 W, now feeding the learning and runtime flow below]

Model Learning:
• Sensor and counter inputs: ϕ(x)
• Optimal setting calculation: y
• Model learning produces the model parameters: w
• Statistical classifier to determine most relevant metrics (logistic regression, L1 regularization, …)

Runtime Operation:
• Controller: model query and model lookup
• Inputs from the server node: ϕ(x); model parameters: w
• Output to the server node: optimal settings y
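The "optimal setting calculation" step can be illustrated with a brute-force lookup: among the settings whose power fits under the cap, pick the one with the shortest runtime. This is a sketch of the labeling step only, not the learned classifier used at runtime. The numbers are the illustrative values from the Pack & Cap figure, and pairing them with the settings in listed order, as well as treating a setting at exactly the cap as feasible, are assumptions.

```python
# (frequency, cores) -> (power in W, runtime in s), in the order shown on the slide.
PROFILE_1 = {
    ("1.60 GHz", 1): (105, 50),
    ("2.00 GHz", 1): (115, 45),
    ("2.67 GHz", 1): (125, 40),
    ("1.60 GHz", 2): (110, 30),
    ("2.00 GHz", 2): (120, 25),
    ("2.67 GHz", 2): (130, 20),
}
# The second profile in the figure draws 5 W more at every setting.
PROFILE_2 = {s: (p + 5, t) for s, (p, t) in PROFILE_1.items()}


def best_setting(profile, power_cap):
    """Fastest setting whose power does not exceed the cap, or None."""
    feasible = {s: pt for s, pt in profile.items() if pt[0] <= power_cap}
    if not feasible:
        return None
    return min(feasible, key=lambda s: feasible[s][1])


print(best_setting(PROFILE_1, power_cap=120))
print(best_setting(PROFILE_2, power_cap=120))
```

Labels produced this way for many workload samples are what a classifier over the counter features ϕ(x) would be trained to predict, so that no exhaustive search is needed at runtime.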

Adherence to Power Caps [ICCAD’11, MICRO’11, IEEE Micro’12]

Without a power meter:

• 0 W margin: 82% adherence

• 5 W margin: 96% adherence

• 10 W margin: 99+% adherence

Feedback control using a power meter improves tracking ability.
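One way to read the adherence figures is as the fraction of power samples that stay at or below the cap plus a tolerance margin. That definition is an assumption of this sketch, and the samples are made up; the slide does not say how adherence was measured.

```python
def adherence(power_samples, cap, margin=0.0):
    """Fraction of samples with power <= cap + margin."""
    within = sum(1 for p in power_samples if p <= cap + margin)
    return within / len(power_samples)


# Hypothetical power readings in watts under a 120 W cap.
samples = [118, 121, 119, 126, 120, 123, 117, 131, 120, 119]
for margin in (0, 5, 10):
    print(margin, adherence(samples, cap=120, margin=margin))
```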

Temperature vs. Leakage Power [Zapater, DATE’13], [Zapater, Trans. PDS’14]

Efficient Consolidation [Hankendi, IGCC’13, ISLPED’13]

• Multi-threading becoming more common (media processing, scientific apps, financial computing, etc.)

• Cloud resources for HPC (among other traditional uses of the cloud)

• Resource allocation per application / per VM: a key factor in power control & efficient consolidation

[Figure: VMs with different numbers of vCPUs consolidated on top of virtualization layers]

Predicting Performance Scalability [Hankendi, IGCC’13, ISLPED’13]

Estimate the CPU demand of a VM:

• CPU demand = RUN% + READY%

• 97% accuracy for estimating the CPU demand of the applications

• Without requiring offline training
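As a worked example of the estimate, the sketch below adds the two percentages for a couple of VMs. RUN% and READY% are read here as the share of time a VM's vCPUs spend running and the share spent ready to run but waiting for a physical CPU; the VM names and readings are hypothetical.

```python
def cpu_demand(run_pct, ready_pct):
    """CPU demand = RUN% + READY%."""
    return run_pct + ready_pct


# Hypothetical hypervisor readings: (RUN%, READY%) per VM.
readings = {"vm1": (60.0, 25.0), "vm2": (40.0, 0.0)}
demands = {vm: cpu_demand(run, ready) for vm, (run, ready) in readings.items()}
print(demands)
```

A VM with a non-zero READY% would have used more CPU had it been available, which is why the sum, rather than RUN% alone, approximates its demand.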

Adaptive Power Capping: vCap [Hankendi, IGCC’13, ISLPED’13]

[Diagram: the VM monitor / hypervisor supplies power readings to vCap; the power cap and QoS requirements are inputs]

vCap steps:
• Estimate CPU demands
• Compute Rcap
• Check QoS and Pcap violations
• Set CPU limits Climit(VMn)

Budgeting Computational vs. Cooling Power [Tuncer, ICCD’14]

[Diagram labels: Tmax; Pcompute; Tcrac, Pcooling; Pserver0, Pserver1, Pserver2]

CoolBudget [ICCD’14]

1. Offline system characterization
2. Wait for workload change
3. Model workload
4. Optimization:
   • Budget Pcompute for a given Tcrac
   • Best Tcrac found? If no: change Tcrac and repeat. If yes: continue.
5. Set Tcrac and server powers
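The optimization loop can be sketched as a search over candidate CRAC temperatures, computing for each one how much power can go to computation. Every model here is a stand-in chosen for illustration and is not the CoolBudget formulation: cooling power is assumed to be Pcompute / CoP(Tcrac) with a made-up linear CoP, server temperature is assumed to be Tcrac plus a thermal resistance times server power, and the budget is split evenly across servers.

```python
def cop(t_crac):
    """Hypothetical cooling efficiency: warmer supply air is cheaper to produce."""
    return 1.0 + 0.1 * t_crac


def compute_budget(t_crac, p_total, n_servers, t_max, r_thermal):
    """Largest P_compute that respects both the total power cap and T_max."""
    # Power limit: P_compute + P_compute / CoP <= P_total.
    power_limit = p_total * cop(t_crac) / (1.0 + cop(t_crac))
    # Thermal limit: T_crac + r_thermal * P_server <= T_max for every server.
    thermal_limit = n_servers * (t_max - t_crac) / r_thermal
    return max(0.0, min(power_limit, thermal_limit))


def best_t_crac(candidates, **system):
    """Try each candidate T_crac and keep the one with the largest compute budget."""
    return max(candidates, key=lambda t: compute_budget(t, **system))


# Hypothetical system: 10 kW total cap, 40 servers, 70 C limit, 0.25 C/W.
system = dict(p_total=10000.0, n_servers=40, t_max=70.0, r_thermal=0.25)
t_best = best_t_crac([15.0, 20.0, 25.0, 30.0], **system)
print(t_best, compute_budget(t_best, **system))
```

The trade-off the sketch exposes is that a cooler room leaves thermal headroom but spends more of the cap on cooling, while a warmer room does the opposite, so the best setpoint sits where the two limits cross.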

Computing Energy Efficiency: Take-Aways

1. Complexity of the systems and the physical phenomena

[Diagram: interactions among performance, power, thermal hot spots and gradients, cooling cost, reliability, leakage, and energy cost]


Computing Energy Efficiency: Take-Aways

1. Complexity of the systems and the physical phenomena

2. Time-varying and diverse application behavior

3. Changes in how cost is assessed, e.g., integration with “provider-side” programs (ISO ↔ Data Center)


Key Aspects of Data Center Efficiency Research

• Intra- & cross-layer optimization

• Application-awareness in all the layers

• Runtime learning and optimization capabilities

• Interactions of physical phenomena

• Strong interdisciplinary aspect: novel materials, architectures, energy markets, etc.


Performance and Energy Aware Computing Laboratory
http://www.bu.edu/peaclab

Current graduate students: Hao Chen, Fulya Kaplan, Tiansheng Zhang, Ozan Tuncer, Onur Sahin, Emre Ates, Onur Zungur, Yijia Zhang

Postdoctoral researcher: Dr. Ata Turk

Many master's students, especially: Dan Rossell, Charlie De Vivero

Visitors: Dr. Marina Zapater, Dr. Andrea Bartolini

Alumni: Dr. Jie Meng, Dr. Can Hankendi, Nathaniel Michener, Ann Lane, Katsu Kawakami, John Furst, Samuel Howes, Jon Bell, Benjamin Havey, Ryan Mullen

Collaborators:
• D. Atienza & Y. Leblebici @ EPFL
• J. Ayala @ UCM, J. M. Moya @ UPM
• C. Isci, S. Duri @ IBM TJ Watson
• T. Brunschwiler @ IBM Zurich
• L. Benini @ ETHZ / U. of Bologna
• M. Caramanis, M. Herbordt, A. Joshi, J. Klamkin, and Y. Paschalidis @ BU
• K. Gross & K. Vaidyanathan @ Oracle
• V. Leung and A. Rodrigues @ Sandia Labs
• S. Reda @ Brown University
• D. Tullsen @ UCSD