Energy Management for Servers and Clusters
Robert Deaver, Sept. 24th, 2009
Slide 2
Why Manage Energy?
Reduce cost: rack energy usage could account for 23%-50% of colocation revenue [Elnozahy]; utility companies may require rate tariffs or up-front deposits [J. Mitchell-Jackson].
Reduce heat: also reduces cost, allows higher server density, and reducing heat reduces failures.
Slide 3
Two Approaches
Single servers: Energy Conservation Policies for Web Servers. Elnozahy, Kistler, and Rajamony, Proceedings of the 4th USENIX Symposium on Internet Technologies and Systems, March 2003.
Server clusters: Multi-mode Energy Management for Multi-tier Server Clusters. Horvath and Skadron, PACT 2008.
Slide 4
Energy Conservation Policies for Web Servers
Three policies for energy reduction: Dynamic Voltage Scaling (DVS), Request Batching, and DVS + Request Batching. All three policies trade system responsiveness to conserve energy. Results are evaluated using a simulator and a hardware testbed.
Slide 5
The Policies
Focus on reducing CPU energy: the CPU is the dominant consumer [Bohrer] and exhibits the most variation in energy consumption. Feedback-driven control framework: the administrator specifies a percentile-based response-time goal; most experiments use a 50 ms 90th-percentile response-time goal.
Slide 6
The Policies: DVS
Varies CPU frequency and voltage to conserve energy while meeting response-time requirements. Most beneficial for moderate workloads. Not task-based! A task-based approach works well for desktop environments but not for server environments.
Ad-hoc controller: if the response-time goal is being met, decrease the CPU frequency; otherwise, increase it.
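The ad-hoc controller above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the frequency bounds and step size are taken from the deck's 600 MHz hardware model (300-600 MHz in 33 MHz steps).

```python
# Sketch of the ad-hoc DVS controller: step the CPU frequency down while the
# response-time goal is met, and back up when it is violated.
FREQ_MIN, FREQ_MAX, STEP = 300, 600, 33  # MHz, from the deck's hardware model

def dvs_step(freq_mhz, p90_response_ms, goal_ms=50):
    """Return the next CPU frequency given the 90th-percentile response time."""
    if p90_response_ms <= goal_ms:
        # Goal met: slow down to save energy.
        return max(FREQ_MIN, freq_mhz - STEP)
    # Goal missed: speed back up.
    return min(FREQ_MAX, freq_mhz + STEP)

freq = 600
for p90 in [12.0, 15.0, 20.0, 60.0]:  # simulated measurements
    freq = dvs_step(freq, p90)
print(freq)  # 534: three decreases, then one increase
```

In the paper this loop is driven by measured response-time percentiles each control period; the single-step adjustment is what makes the controller "ad-hoc" rather than control-theoretic.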
Slide 7
The Policies: Request Batching
1. Delay servicing of incoming requests.
2. Keep the CPU in a low-power state.
3. Packets accumulate in a buffer.
4. When a packet has been pending longer than the specified batching timeout, wake up and process the requests.
If the CPU's low-power state saves 2.5 W and server utilization is 25%, it is possible to save 162 kJ/day. Most beneficial for very light workloads.
Controller: if the response-time goal is being met, increase the batching timeout; otherwise, decrease it.
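The 162 kJ/day figure follows directly from the slide's stated assumptions, and can be checked with a short calculation:

```python
# Sanity check of the savings figure above: the low-power state saves 2.5 W
# and the server is 25% utilized, so batching can keep the CPU in the
# low-power state for the idle 75% of the day.
power_saved_w = 2.5          # W saved while in the low-power state
idle_fraction = 1 - 0.25     # 25% utilization -> 75% idle
seconds_per_day = 24 * 3600

energy_saved_j = power_saved_w * idle_fraction * seconds_per_day
print(energy_saved_j / 1000)  # 162.0 kJ/day, matching the slide
```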
Slide 8
The Policies: Combined
Uses Request Batching when the workload is very light and DVS when the workload is moderate.
Slide 9
Workloads
Constructed from web server logs, extended by modifying the inter-arrival time of connections by a scale factor.

Workload                                  Olympics98         Finance           Disk Intense
Avg (peak) requests/sec                   97 (171)           16 (46)           15 (30)
Avg requests/connection                   12                 8.5               31
Unique files (total file size)            61,807 (795 MB)    16,872 (171 MB)   698,232 (6,205 MB)
Distinct HTTP requests                    8,370,093          1,360,886         1,290,196
Total response size (excl. HTTP headers)  49,871 MB          2,811 MB          10,172 MB
97% / 98% / 99% (MB)                      24.8 / 50.9 / 141  3.74 / 6.46 / 13.9  2,498 / 2,860 / 3,382
Slide 10
Salsa, a Simulator
Estimates the energy consumption and response time of a web server. Based on a queuing model built using the CSIM execution engine; models process scheduling and file cache hits and misses. Validated against real hardware.

Hardware model: CPU frequency 600 MHz; P_max 27.2 W; P_idle 4.97 W; P_DeepSleep 2.47 W; DVS range 300-600 MHz in 33 MHz steps.
Slide 11
Prototype
Used to validate Salsa. Specs: 600 MHz CPU, Linux 2.4.3 kernel, Apache web server. The prototype does not place the CPU into a low-power state and does not use response-time feedback control, so Salsa is run in open-loop mode for validation.
Slide 12
Validation: Energy
The prototype batched requests for 11,953 s; Salsa predicted 12,373 s, a 3.5% error.
Slide 18
Evaluation: DVS vs. Request Batching
Energy savings depend on the workload; both policies are effective for energy conservation.

Workload                             Olympics98-4x     Finance-12x       Disk-Intense-2x
Base energy (J)                      1,254,672         739,212           663,648
Base 90th-percentile response (ms)   12.3              6.4               3.0
DVS joules (% savings)               915,204 (27%)     518,844 (30%)     494,982 (25%)
Request Batching joules (% savings)  1,166,128 (7.0%)  606,468 (18.0%)   525,836 (20.8%)
Slide 20
Evaluation: Combined Policy vs. DVS vs. Request Batching
Slide 21
Evaluation: Combined Policy
Slide 22
Evaluation: Combined Policy
Slide 23
Faster Processors
Current CPU clock rates are far above 600 MHz. DVS savings (as a % of energy consumed) remain the same, while Request Batching savings increase. These results have not been validated against real hardware.

Hardware model: CPU frequency 3.0 GHz; P_max 60 W; P_idle 10 W; P_DeepSleep 5 W; DVS range 1.5-3.0 GHz in 150 MHz steps.
Slide 24
Faster Processors: Simulation Results
Slide 25
Related Work
DVS: CPU utilization over intervals used to predict future utilization [Govil][Weiser]; CPU frequency/voltage set on a per-task basis [Flautner]. These perform well for desktop systems but not in server environments.
Simulation: Wattch, a microprocessor power analysis tool [Brooks et al.]; PowerScope, a tool for profiling application energy use [Flinn et al.]. Salsa is substantially faster because it is targeted at web workloads.
Slide 26
Conclusions
DVS: vary CPU frequency and voltage to save energy; most energy savings with medium workloads.
Request Batching: group requests and process them in batches when the server is under-utilized, keeping the CPU in sleep mode as much as possible; most energy savings with light workloads.
DVS + Request Batching: best of both policies! Saves 17%-42% of CPU energy across a broad range of workloads.
Slide 27
Critique
Request Batching is never compared to a policy that uses deep sleep but does not batch requests. The DVS and Request Batching controllers are ad-hoc solutions with no control-theoretic analysis. The policies were only tested on static content.
Slide 28
Two Approaches
Single servers: Energy Conservation Policies for Web Servers. Elnozahy, Kistler, and Rajamony, Proceedings of the 4th USENIX Symposium on Internet Technologies and Systems, March 2003.
Server clusters: Multi-mode Energy Management for Multi-tier Server Clusters. Horvath and Skadron, PACT 2008.
Slide 29
Multi-mode Energy Management for Multi-tier Server Clusters
Uses DVS and multiple sleep states to manage energy consumption for a server cluster. Provides a theoretical analysis of power optimization and validates the policies on a multi-tier server cluster. Cluster-wide energy savings of up to 25% with no performance degradation!
Slide 30
Current Solutions
Focus on the active portion of the cluster.
Dynamic Voltage Scaling (DVS): used on a per-server basis; increases power efficiency by slowing down the CPU.
Dynamic cluster reconfiguration: load is consolidated on a subset of servers; unused servers (after consolidation) are shut down.
Slide 31
Related Work
Distributing demand to a cluster subset [Pinheiro et al.]: a PID controller compensates for transient demand variations; 45% energy savings. The static web workload was interface-bound with peak CPU utilization of 25%, and machines are assumed to have lower-than-actual capacity to compensate for wakeup latency.
Cluster reconfiguration combined with DVS [Elnozahy et al.]: assumes a cubic relation between CPU frequency and power; very different results due to the different power model.
Slide 32
Outline
Models
Energy Management and Optimization
Policies
Experiments and Analysis
Slide 33
System Model
Multi-tier server cluster: all machines in one tier run the same application; requests go through all tiers; end-to-end performance is subject to a Service Level Agreement (SLA).
Assumptions: all machines in a single tier have identical power and performance characteristics, and load balancing within a tier is perfect. The latter is required for analytical tractability; observations show moderate imbalances are insignificant!
Slide 34
Power Model
Obtained through power measurements over a large pool of characterization experiments varying U_i and f_i. Power usage is approximately linear.
Slide 35
Power Model
P_i is power, U_i is utilization, and f_i is frequency. The parameters a_ij are found through curve fitting; the test system had an average error of 1%.
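An approximately linear per-server power model of this shape can be sketched as below. The coefficient values are made up for illustration (the paper fits its a_ij to measurements), and the exact set of terms in the paper's model may differ:

```python
# Illustrative linear power model: P_i = a0 + a1*U_i + a2*f_i, with the
# coefficients a_ij obtained by curve fitting against measured power.
# The values (70, 40, 30) are invented for this sketch, not from the paper.
def power_w(u, f_ghz, a=(70.0, 40.0, 30.0)):
    """Predicted server power (W) from CPU utilization u in [0, 1] and frequency (GHz)."""
    a0, a1, a2 = a
    return a0 + a1 * u + a2 * f_ghz

print(power_w(0.0, 1.5))  # 115.0: idle at the lowest frequency
print(power_w(1.0, 3.0))  # 200.0: fully loaded at the top frequency
```

The appeal of a linear fit is that, once the a_ij are known, cluster-wide power for a candidate configuration is a cheap sum over machines, which the optimization policies below exploit.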
Slide 36
Service Latency Model (SLM)
The service latency of short requests is mostly a function of CPU utilization, and the offered load can be estimated from measurements. Prediction: 1. estimate the current offered load from measurements; 2. predict U_i from that load. The SLM is obtained via regression analysis using a heuristically chosen functional form.
Slide 37
Outline
Models
Energy Management and Optimization
Policies
Experiments and Analysis
Slide 38
Multi-mode Energy Management
Must consider both active and idle (sleeping) nodes. Minimizing E_transition is less important because Internet server workloads fluctuate on a larger time scale.
Slide 39
Active Energy Optimization
Assigns machines to tiers and determines their operating frequencies. An energy management strategy is optimal iff total power consumption is minimal and the SLA is met.
Slide 40
Sleep Energy Optimization
Servers may support up to n sleep states (S-states). Assumptions: workload spikes are unpredictable, and arbitrarily large spikes are not supported. The Maximum Accommodated Load Increase Rate (MALIR) is defined to ensure the system can meet the target SLA. E_sleep is minimized by placing each unallocated server in the deepest possible sleep state subject to the MALIR constraint.
[Diagram: sleep states S0 through Sn, each with a power level p_i and a wake-up latency w_i]
Slide 41
Feasible Wakeup Schedule
What is the minimum number of servers for each sleep state? If load increases at the MALIR rate, the cluster must wake up machines in time to respond. With c the cluster capacity and d the demand, and c(t_0) and d(t_0) known, a feasible wakeup schedule exists iff capacity never falls below demand: c(t) >= d(t) for all t >= t_0.
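The feasibility condition can be illustrated with a small simulation: demand ramps up at a fixed rate, and a sleeping server woken at t=0 only contributes capacity after its state's wake-up latency. All numbers (latencies, counts, rates) below are invented for the sketch:

```python
# Sketch of the wakeup-feasibility check: capacity must stay at or above
# demand at every point while sleeping spares wake up.
def feasible(c0, d0, rate, sleepers, horizon=60.0, dt=0.1):
    """sleepers: list of (wakeup_latency_s, count); each server adds 1 capacity unit."""
    t = 0.0
    while t <= horizon:
        demand = d0 + rate * t
        # Spares in a state with latency w are awake (and contribute) once t >= w.
        capacity = c0 + sum(n for w, n in sleepers if w <= t)
        if capacity < demand:
            return False  # demand outruns the wakeup schedule
        t += dt
    return True

# 10 active servers serving demand 8, load rising 0.3 units/s;
# spares split between a shallow state (5 s wakeup) and a deep one (30 s).
print(feasible(10, 8, 0.3, [(5.0, 10), (30.0, 20)]))   # True
print(feasible(10, 8, 1.0, [(30.0, 20)]))              # False: deep spares wake too late
```

This is why shallow, higher-power states must hold some spares even though deeper states are cheaper: they cover the gap before the deep sleepers come online.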
Slide 42
Spare Servers
Optimal number of spare servers for each sleep state, and its discretized form. [Equations omitted; derivation included in the paper]
Slide 43
Outline
Models
Energy Management and Optimization
Policies
Experiments and Analysis
Slide 44
Active Capacity Policy
Brute force: exhaustive search of all possible cluster configurations; does not scale to large clusters!
Heuristic approach: assumes it never saves power to power on an additional machine and lower the cluster CPU frequency [Pinheiro et al.]; takes 2 rounds of calculations. Similar to the queuing-theory-based approach by Chen et al.
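For a single tier, the brute-force search mentioned above amounts to enumerating (machine count, frequency) pairs and keeping the cheapest one that still meets demand. The throughput and power models below are invented stand-ins (the paper uses its fitted linear power model and SLM), but the search structure is the point:

```python
# Illustrative brute-force active-capacity search for one tier.
def best_config(demand, max_machines, freqs_ghz):
    best = None
    for n in range(1, max_machines + 1):
        for f in freqs_ghz:
            capacity = n * f * 10              # assume throughput scales with frequency
            if capacity < demand:
                continue                       # configuration cannot meet the SLA
            u = demand / capacity              # resulting per-machine utilization
            power = n * (70 + 40 * u + 30 * f) # made-up linear power model (W)
            if best is None or power < best[0]:
                best = (power, n, f)
    return best

print(best_config(100, 12, [1.5, 2.0, 2.5, 3.0]))  # (740.0, 4, 2.5)
```

The search is O(machines x frequencies) per tier here, but over a whole multi-tier cluster the joint configuration space explodes combinatorially, which is why the paper falls back to a heuristic.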
Slide 45
Spare Server Policy - Optimal
[Flowchart: while the number of idle nodes in S0 exceeds the target S0*, place idle nodes into an S-state (S1 ... Sn); otherwise done]
Slide 46
Spare Server Policy - Demotion
Maintain a list containing the count of idle machines and the time each smaller count was first seen. During each control period, the list is used to determine the optimal number of machines for each sleep state, and nodes are demoted to states that have a deficit of machines, starting with the deepest state.
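The idle_since bookkeeping can be sketched as follows. This is a hypothetical reading of the slides, not the paper's exact algorithm: entry i of the list records since when at least i+1 machines have been continuously idle, so growing the idle count appends fresh timestamps and shrinking it drops the most recent ones.

```python
# Sketch of the idle_since list: keep one timestamp per concurrently idle
# machine "slot", oldest first, so len(idle_since) == current idle count.
def update_idle_since(idle_since, n_idle, now):
    """Shrink or grow the list so len(idle_since) == n_idle."""
    if n_idle < len(idle_since):
        # Fewer machines idle: the "at least k idle" condition broke for the
        # largest k values, so drop the most recent timestamps.
        del idle_since[n_idle:]
    else:
        # More machines idle: the new slots became idle just now.
        idle_since.extend([now] * (n_idle - len(idle_since)))
    return idle_since

hist = []
update_idle_since(hist, 3, 10)  # 3 machines idle since t=10
update_idle_since(hist, 5, 20)  # 2 more go idle at t=20
update_idle_since(hist, 4, 30)  # one wakes up at t=30
print(hist)  # [10, 10, 10, 20]
```

A demotion pass can then compare each timestamp against the control deadline (t > t* + w_i on the later slide) to decide which idle machines are safe to push into deeper states.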
Slide 47
Spare Server Policy - Demotion
[Diagram: the idle_since timestamp list, initialized empty]
Slide 48
Spare Server Policy - Demotion
[Diagram: at t=10, 6 machines are idling; timestamps are appended so that the number of idling machines equals sizeof(idle_since)]
Slide 49
Spare Server Policy - Demotion
[Diagram: the idle count drops to 2 machines; the most recent timestamps are removed so that the number of idling machines again equals sizeof(idle_since)]
Slide 50
Spare Server Policy - Demotion
[Diagram at t=50: S0 has 6 running and 2 idle machines; S3 holds 2 machines; S4 holds 2 machines]
Slide 51
Spare Server Policy - Demotion
[Diagram at t=50: machines whose idle_since timestamps satisfy t > t* + w_i are demoted, filling the deficits for states i=3 and i=4]
Slide 52
Load Estimation
Performance monitors detect when response times exceed D and detect errors from overload (timeouts). Once either monitor triggers a fault, feedback control is used to drive performance back within the SLA spec.
Slide 53
Outline
Models
Energy Management and Optimization
Policies
Experiments and Analysis
Slide 54
Experiment
12-node, 4-tier web server cluster: front-end load balancer, web (HTTP) servers, application servers, and database servers. The baseline cluster is statically provisioned for peak load. The test load is a 3-tier implementation of the TPC-W benchmark.
Slide 55
Performance and Energy Efficiency
Slide 56
Total Energy Savings
Slide 57
Key Observations
Energy savings of 6-14% from exploiting multiple sleep states. The average gain for Demotion is 10%; for Optimal, 7%. Optimal's workload sensitivity (7-9%) is smaller than Demotion's (6-14%). Optimal is the overall winner, with 7% savings over Demotion.
Slide 58
Conclusions
Energy can be saved in server clusters in both active and spare capacities: active capacity optimization can be achieved through DVS, and spare capacity optimization through multiple sleep states. Multiple sleep states save up to 50% more energy than an off-only solution. The Optimal policy is superior to the Demotion policy.
Slide 59
Critique
Notation is not always clearly defined, and the algorithm explanations are hard to follow. The newest server trace was 10 years old when the paper was published!
Slide 60
Request Batching vs. Cluster Reconfiguration
Request Batching: focuses on single web servers with light load; relies primarily on simulation and a hardware testbed; ad-hoc controllers.
Cluster Reconfiguration: focuses on tiered server clusters with varying load; has a mathematical foundation; control-theory-based controllers.