Energy Management for Servers and Clusters
Robert Deaver, Sept. 24th, 2009
Slide 2
Why Manage Energy?
Reduce cost: rack energy usage could account for 23%-50% of colocation revenue [Elnozahy]; utility companies may require rate tariffs or up-front deposits [J. Mitchell-Jackson].
Reduce heat: also reduces cost, allows higher server density, and reducing heat reduces failures.
Slide 3
Two Approaches
Single servers: Energy Conservation Policies for Web Servers. Elnozahy, Kistler, and Rajamony, Proceedings of the 4th USENIX Symposium on Internet Technologies and Systems, March 2003.
Server clusters: Multi-mode Energy Management for Multi-tier Server Clusters. Horvath and Skadron, PACT 2008.
Slide 4
Energy Conservation Policies for Web Servers
Three policies for energy reduction: Dynamic Voltage Scaling (DVS), Request Batching, and DVS + Request Batching. All three policies trade system responsiveness to conserve energy. Results are evaluated using a simulator and a hardware testbed.
Slide 5
The Policies
Focus on reducing CPU energy: the CPU is the dominant consumer [Bohrer] and exhibits the most variation in energy consumption. Feedback-driven control framework: the administrator specifies a percentile-based response-time goal; most experiments use a 50 ms 90th-percentile response-time goal.
Slide 6
The Policies: DVS
Varies CPU frequency and voltage to conserve energy while meeting response-time requirements. Most beneficial for moderate workloads. Not task-based! A task-based approach works well for desktop environments but not for server environments.
Ad-hoc controller: if the response-time goal is being met, decrease the CPU frequency; otherwise, increase it.
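The ad-hoc controller above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the frequency bounds and step size are taken from the deck's 600 MHz hardware model (300-600 MHz in 33 MHz steps).

```python
# Sketch of the ad-hoc DVS controller: step the CPU frequency down while the
# response-time goal is met, and back up when it is violated.
FREQ_MIN, FREQ_MAX, STEP = 300, 600, 33  # MHz, from the deck's hardware model

def dvs_step(freq_mhz, p90_response_ms, goal_ms=50):
    """Return the next CPU frequency given the 90th-percentile response time."""
    if p90_response_ms <= goal_ms:
        # Goal met: slow down to save energy.
        return max(FREQ_MIN, freq_mhz - STEP)
    # Goal missed: speed back up.
    return min(FREQ_MAX, freq_mhz + STEP)

freq = 600
for p90 in [12.0, 15.0, 20.0, 60.0]:  # simulated measurements
    freq = dvs_step(freq, p90)
print(freq)  # 534: three decreases, then one increase
```

In the paper this loop is driven by measured response-time percentiles each control period; the single-step adjustment is what makes the controller "ad-hoc" rather than control-theoretic.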
Slide 7
The Policies: Request Batching
1. Delay servicing of incoming requests.
2. Keep the CPU in a low-power state.
3. Packets accumulate in a buffer.
4. When a packet has been pending longer than the specified batching timeout, wake up and process the requests.
If the CPU's low-power state saves 2.5 W and server utilization is 25%, it is possible to save 162 kJ/day. Most beneficial for very light workloads.
Controller: if the response-time goal is being met, increase the batching timeout; otherwise, decrease it.
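The 162 kJ/day figure follows directly from the slide's stated assumptions, and can be checked with a short calculation:

```python
# Sanity check of the savings figure above: the low-power state saves 2.5 W
# and the server is 25% utilized, so batching can keep the CPU in the
# low-power state for the idle 75% of the day.
power_saved_w = 2.5          # W saved while in the low-power state
idle_fraction = 1 - 0.25     # 25% utilization -> 75% idle
seconds_per_day = 24 * 3600

energy_saved_j = power_saved_w * idle_fraction * seconds_per_day
print(energy_saved_j / 1000)  # 162.0 kJ/day, matching the slide
```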
Slide 8
The Policies: Combined
Uses Request Batching when the workload is very light and DVS when the workload is moderate.
Slide 9
Workloads
Constructed from web server logs, extended by modifying the inter-arrival time of connections by a scale factor.

Workload                                  Olympics98         Finance           Disk Intense
Avg (peak) requests/sec                   97 (171)           16 (46)           15 (30)
Avg requests/connection                   12                 8.5               31
Unique files (total file size)            61,807 (795 MB)    16,872 (171 MB)   698,232 (6,205 MB)
Distinct HTTP requests                    8,370,093          1,360,886         1,290,196
Total response size (excl. HTTP headers)  49,871 MB          2,811 MB          10,172 MB
97% / 98% / 99% (MB)                      24.8 / 50.9 / 141  3.74 / 6.46 / 13.9  2,498 / 2,860 / 3,382
Slide 10
Salsa, a Simulator
Estimates the energy consumption and response time of a web server. Based on a queuing model built using the CSIM execution engine; models process scheduling and file cache hits and misses. Validated against real hardware.

Hardware model: CPU frequency 600 MHz; P_max 27.2 W; P_idle 4.97 W; P_DeepSleep 2.47 W; DVS range 300-600 MHz in 33 MHz steps.
Slide 11
Prototype
Used to validate Salsa. Specs: 600 MHz CPU, Linux 2.4.3 kernel, Apache web server. The prototype does not place the CPU into a low-power state and does not use response-time feedback control, so Salsa is run in open-loop mode for validation.
Slide 12
Validation: Energy
The prototype batched requests for 11,953 s; Salsa predicted 12,373 s, a 3.5% error.
Slide 18
Evaluation: DVS vs. Request Batching
Energy savings depend on the workload; both policies are effective for energy conservation.

Workload                             Olympics98-4x     Finance-12x       Disk-Intense-2x
Base energy (J)                      1,254,672         739,212           663,648
Base 90th-percentile response (ms)   12.3              6.4               3.0
DVS joules (% savings)               915,204 (27%)     518,844 (30%)     494,982 (25%)
Request Batching joules (% savings)  1,166,128 (7.0%)  606,468 (18.0%)   525,836 (20.8%)
Slide 20
Evaluation: Combined Policy vs. DVS vs. Request Batching
Slide 21
Evaluation: Combined Policy
Slide 22
Evaluation: Combined Policy
Slide 23
Faster Processors
Current CPU clock rates are far above 600 MHz. DVS savings (as a % of energy consumed) remain the same, while Request Batching savings increase. These results have not been validated against real hardware.

Hardware model: CPU frequency 3.0 GHz; P_max 60 W; P_idle 10 W; P_DeepSleep 5 W; DVS range 1.5-3.0 GHz in 150 MHz steps.
Slide 24
Faster Processors: Simulation Results
Slide 25
Related Work
DVS: CPU utilization over intervals used to predict future utilization [Govil][Weiser]; CPU frequency/voltage set on a per-task basis [Flautner]. These perform well for desktop systems but not in server environments.
Simulation: Wattch, a microprocessor power analysis tool [Brooks et al.]; PowerScope, a tool for profiling application energy use [Flinn et al.]. Salsa is substantially faster because it is targeted at web workloads.
Slide 26
Conclusions
DVS: vary CPU frequency and voltage to save energy; most energy savings with medium workloads.
Request Batching: group requests and process them in batches when the server is under-utilized, keeping the CPU in sleep mode as much as possible; most energy savings with light workloads.
DVS + Request Batching: best of both policies! Saves 17%-42% of CPU energy across a broad range of workloads.
Slide 27
Critique
Request Batching is never compared to a policy that uses deep sleep but does not batch requests. The DVS and Request Batching controllers are ad-hoc solutions with no control-theoretic analysis. The policies were only tested on static content.
Slide 28
Two Approaches
Single servers: Energy Conservation Policies for Web Servers. Elnozahy, Kistler, and Rajamony, Proceedings of the 4th USENIX Symposium on Internet Technologies and Systems, March 2003.
Server clusters: Multi-mode Energy Management for Multi-tier Server Clusters. Horvath and Skadron, PACT 2008.
Slide 29
Multi-mode Energy Management for Multi-tier Server Clusters
Uses DVS and multiple sleep states to manage energy consumption for a server cluster. Provides a theoretical analysis of power optimization and validates the policies on a multi-tier server cluster. Cluster-wide energy savings of up to 25% with no performance degradation!
Slide 30
Current Solutions
Focus on the active portion of the cluster.
Dynamic Voltage Scaling (DVS): used on a per-server basis; increases power efficiency by slowing down the CPU.
Dynamic cluster reconfiguration: load is consolidated on a subset of servers; unused servers (after consolidation) are shut down.
Slide 31
Related Work
Distributing demand to a cluster subset [Pinheiro et al.]: a PID controller compensates for transient demand variations; 45% energy savings. The static web workload was interface-bound with peak CPU utilization of 25%, and machines are assumed to have lower-than-actual capacity to compensate for wakeup latency.
Cluster reconfiguration combined with DVS [Elnozahy et al.]: assumes a cubic relation between CPU frequency and power; very different results due to the different power model.
Slide 32
Outline
Models
Energy Management and Optimization
Policies
Experiments and Analysis
Slide 33
System Model
Multi-tier server cluster: all machines in one tier run the same application; requests go through all tiers; end-to-end performance is subject to a Service Level Agreement (SLA).
Assumptions: all machines in a single tier have identical power and performance characteristics, and load balancing within a tier is perfect. The latter is required for analytical tractability; observations show moderate imbalances are insignificant!
Slide 34
Power Model
Obtained through power measurements over a large pool of characterization experiments varying U_i and f_i. Power usage is approximately linear.
Slide 35
Power Model
P_i is power, U_i is utilization, and f_i is frequency. The parameters a_ij are found through curve fitting; the test system had an average error of 1%.
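An approximately linear per-server power model of this shape can be sketched as below. The coefficient values are made up for illustration (the paper fits its a_ij to measurements), and the exact set of terms in the paper's model may differ:

```python
# Illustrative linear power model: P_i = a0 + a1*U_i + a2*f_i, with the
# coefficients a_ij obtained by curve fitting against measured power.
# The values (70, 40, 30) are invented for this sketch, not from the paper.
def power_w(u, f_ghz, a=(70.0, 40.0, 30.0)):
    """Predicted server power (W) from CPU utilization u in [0, 1] and frequency (GHz)."""
    a0, a1, a2 = a
    return a0 + a1 * u + a2 * f_ghz

print(power_w(0.0, 1.5))  # 115.0: idle at the lowest frequency
print(power_w(1.0, 3.0))  # 200.0: fully loaded at the top frequency
```

The appeal of a linear fit is that, once the a_ij are known, cluster-wide power for a candidate configuration is a cheap sum over machines, which the optimization policies below exploit.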
Slide 36
Service Latency Model (SLM)
The service latency of short requests is mostly a function of CPU utilization, and the offered load can be estimated from measurements. Prediction: 1. estimate the current offered load from measurements; 2. predict U_i from that load. The SLM is obtained via regression analysis using a heuristically chosen functional form.
Slide 37
Outline
Models
Energy Management and Optimization
Policies
Experiments and Analysis
Slide 38
Multi-mode Energy Management
Must consider both active and idle (sleeping) nodes. Minimizing E_transition is less important because Internet server workloads fluctuate on a larger time scale.
Slide 39
Active Energy Optimization
Assigns machines to tiers and determines their operating frequencies. An energy management strategy is optimal iff total power consumption is minimal and the SLA is met.
Slide 40
Sleep Energy Optimization
Servers may support up to n sleep states (S-states). Assumptions: workload spikes are unpredictable, and arbitrarily large spikes are not supported. The Maximum Accommodated Load Increase Rate (MALIR) is defined to ensure the system can meet the target SLA. E_sleep is minimized by placing each unallocated server in the deepest possible sleep state subject to the MALIR constraint.
[Diagram: sleep states S0 through Sn, each with a power level p_i and a wake-up latency w_i]
Slide 41
Feasible Wakeup Schedule
What is the minimum number of servers for each sleep state? If load increases at the MALIR rate, the cluster must wake up machines in time to respond. With c the cluster capacity and d the demand, and c(t_0) and d(t_0) known, a feasible wakeup schedule exists iff capacity never falls below demand: c(t) >= d(t) for all t >= t_0.
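The feasibility condition can be illustrated with a small simulation: demand ramps up at a fixed rate, and a sleeping server woken at t=0 only contributes capacity after its state's wake-up latency. All numbers (latencies, counts, rates) below are invented for the sketch:

```python
# Sketch of the wakeup-feasibility check: capacity must stay at or above
# demand at every point while sleeping spares wake up.
def feasible(c0, d0, rate, sleepers, horizon=60.0, dt=0.1):
    """sleepers: list of (wakeup_latency_s, count); each server adds 1 capacity unit."""
    t = 0.0
    while t <= horizon:
        demand = d0 + rate * t
        # Spares in a state with latency w are awake (and contribute) once t >= w.
        capacity = c0 + sum(n for w, n in sleepers if w <= t)
        if capacity < demand:
            return False  # demand outruns the wakeup schedule
        t += dt
    return True

# 10 active servers serving demand 8, load rising 0.3 units/s;
# spares split between a shallow state (5 s wakeup) and a deep one (30 s).
print(feasible(10, 8, 0.3, [(5.0, 10), (30.0, 20)]))   # True
print(feasible(10, 8, 1.0, [(30.0, 20)]))              # False: deep spares wake too late
```

This is why shallow, higher-power states must hold some spares even though deeper states are cheaper: they cover the gap before the deep sleepers come online.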
Slide 42
Spare Servers
Optimal number of spare servers for each sleep state, and its discretized form. [Equations omitted; derivation included in the paper]
Slide 43
Outline
Models
Energy Management and Optimization
Policies
Experiments and Analysis
Slide 44
Active Capacity Policy
Brute force: exhaustive search of all possible cluster configurations; does not scale to large clusters!
Heuristic approach: assumes it never saves power to power on an additional machine and lower the cluster CPU frequency [Pinheiro et al.]; takes 2 rounds of calculations. Similar to the queuing-theory-based approach by Chen et al.
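For a single tier, the brute-force search mentioned above amounts to enumerating (machine count, frequency) pairs and keeping the cheapest one that still meets demand. The throughput and power models below are invented stand-ins (the paper uses its fitted linear power model and SLM), but the search structure is the point:

```python
# Illustrative brute-force active-capacity search for one tier.
def best_config(demand, max_machines, freqs_ghz):
    best = None
    for n in range(1, max_machines + 1):
        for f in freqs_ghz:
            capacity = n * f * 10              # assume throughput scales with frequency
            if capacity < demand:
                continue                       # configuration cannot meet the SLA
            u = demand / capacity              # resulting per-machine utilization
            power = n * (70 + 40 * u + 30 * f) # made-up linear power model (W)
            if best is None or power < best[0]:
                best = (power, n, f)
    return best

print(best_config(100, 12, [1.5, 2.0, 2.5, 3.0]))  # (740.0, 4, 2.5)
```

The search is O(machines x frequencies) per tier here, but over a whole multi-tier cluster the joint configuration space explodes combinatorially, which is why the paper falls back to a heuristic.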
Slide 45
Spare Server Policy - Optimal
[Flowchart: while the number of idle nodes in S0 exceeds the target S0*, place idle nodes into an S-state (S1 ... Sn); otherwise done]
Slide 46
Spare Server Policy - Demotion
Maintain a list containing the count of idle machines and the time each smaller count was first seen. During each control period, the list is used to determine the optimal number of machines for each sleep state, and nodes are demoted to states that have a deficit of machines, starting with the deepest state.
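The idle_since bookkeeping can be sketched as follows. This is a hypothetical reading of the slides, not the paper's exact algorithm: entry i of the list records since when at least i+1 machines have been continuously idle, so growing the idle count appends fresh timestamps and shrinking it drops the most recent ones.

```python
# Sketch of the idle_since list: keep one timestamp per concurrently idle
# machine "slot", oldest first, so len(idle_since) == current idle count.
def update_idle_since(idle_since, n_idle, now):
    """Shrink or grow the list so len(idle_since) == n_idle."""
    if n_idle < len(idle_since):
        # Fewer machines idle: the "at least k idle" condition broke for the
        # largest k values, so drop the most recent timestamps.
        del idle_since[n_idle:]
    else:
        # More machines idle: the new slots became idle just now.
        idle_since.extend([now] * (n_idle - len(idle_since)))
    return idle_since

hist = []
update_idle_since(hist, 3, 10)  # 3 machines idle since t=10
update_idle_since(hist, 5, 20)  # 2 more go idle at t=20
update_idle_since(hist, 4, 30)  # one wakes up at t=30
print(hist)  # [10, 10, 10, 20]
```

A demotion pass can then compare each timestamp against the control deadline (t > t* + w_i on the later slide) to decide which idle machines are safe to push into deeper states.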
Slide 47
Spare Server Policy - Demotion
[Diagram: the idle_since timestamp list, initialized empty]
Slide 48
Spare Server Policy - Demotion
[Diagram: at t=10, 6 machines are idling; timestamps are appended so that the number of idling machines equals sizeof(idle_since)]
Slide 49
Spare Server Policy - Demotion
[Diagram: the idle count drops to 2 machines; the most recent timestamps are removed so that the number of idling machines again equals sizeof(idle_since)]
Slide 50
Spare Server Policy - Demotion
[Diagram at t=50: S0 has 6 running and 2 idle machines; S3 holds 2 machines; S4 holds 2 machines]
Slide 51
Spare Server Policy - Demotion
[Diagram at t=50: machines whose idle_since timestamps satisfy t > t* + w_i are demoted, filling the deficits for states i=3 and i=4]
Slide 52
Load Estimation
Performance monitors detect when response times exceed D and detect errors from overload (timeouts). Once either monitor triggers a fault, feedback control is used to drive performance back within the SLA spec.
Slide 53
Outline
Models
Energy Management and Optimization
Policies
Experiments and Analysis
Slide 54
Experiment
12-node, 4-tier web server cluster: front-end load balancer, web (HTTP) servers, application servers, and database servers. The baseline cluster is statically provisioned for peak load. The test load is a 3-tier implementation of the TPC-W benchmark.
Slide 55
Performance and Energy Efficiency
Slide 56
Total Energy Savings
Slide 57
Key Observations
Energy savings of 6-14% from exploiting multiple sleep states. The average gain for Demotion is 10%; for Optimal, 7%. Optimal's workload sensitivity (7-9%) is smaller than Demotion's (6-14%). Optimal is the overall winner, with 7% savings over Demotion.
Slide 58
Conclusions
Energy can be saved in server clusters in both active and spare capacities: active capacity optimization can be achieved through DVS, and spare capacity optimization through multiple sleep states. Multiple sleep states save up to 50% more energy than an off-only solution. The Optimal policy is superior to the Demotion policy.
Slide 59
Critique
Notation is not always clearly defined, and the algorithm explanations are hard to follow. The newest server trace was 10 years old when the paper was published!
Slide 60
Request Batching vs. Cluster Reconfiguration
Request Batching: focuses on single web servers with light load; relies primarily on simulation and a hardware testbed; ad-hoc controllers.
Cluster Reconfiguration: focuses on tiered server clusters with varying load; has a mathematical foundation; control-theory-based controllers.