Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band...

19

Transcript of Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band...

Page 1: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement
Page 2: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

Power Capping with Cloud Super Apps Performance SLASpeakers:Justin Song @ Alibaba Edmund Song, Nishi Ahuja @Intel

Contributors:Hao Zhu, Guan Wang, Xiaolin Meng @ Alibaba Feng Jiang, Mohan J. Kumar, Xiaoguo Liang, John Leung @ Intel

[H/W Management]

Page 3: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

AgendaMANAGEMENT

Case Studies

• Alibaba Power Capping with Performance SLA• Fine Granularity Power Management Knobs• Redfish Adoption and Practices• Call to Action

Page 4: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

Alibaba Power Capping with Performance SLA

MANAGEMENT

Case Studies

Page 5: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

Alibaba Power Management ArchitectureMANAGEMENT

Case Studies

Page 6: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

Working with AppsMANAGEMENT

Case Studies

Page 7: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

Power Capping TriggeringMANAGEMENT

Case Studies

Page 8: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

Performance SLAMANAGEMENT

Case Studies

Preventing performance downgrade

DVFS: Dynamic Voltage Frequency Scaling CCx: Core C-States

Page 9: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

Results — ExamplesMANAGEMENT

Case Studies

Scenario: high priority instances performance guaranteed with low priority

instances capped

Scenario: capping with no Fmin violation

Page 10: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

Fine Granularity Power Management Knobs

MANAGEMENT

Case Studies

Page 11: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

Cloud Power-Performance Requirement• Rack density• Utilization of provisioned

resource

• Power-Performance proportionality(e.g. SLA matched energy efficiency)

• Performance-per-watt efficiency

• Convergence between infrastructure plane and resource plane

• Hardware intelligences into resource scheduling and orchestration

Capex Optimization Opex Optimization

SLA and Reliability

MANAGEMENT

Case Studies

Page 12: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

Fine Grain Platform Power-Performance KnobsMANAGEMENT

Plat

form

Proc

esso

rCo

re

PowerThermal

Power Frequency

Thermal Package Px/Cx states

Frequency

Core Px/Cx statesOnline/OfflinePower

CPU

DIMM

Storage

Network

Power

Thermal

IDC

Rack

Platform

Efficiency

ReliabilityPerformance

Case Studies

Page 13: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

Intel Practices in Cloud Power-Performance Optimization

MANAGEMENT

Case Studies

K8S Open Stack OpenDCM Proprietary Cloud OS

Group Power Capping

Dynamic Core Mgmt.

Peak Shaving via Smart BBU(Turbo Rack)

HW Telemetry Awaress Scheduling

RAPL/DVFS Px/Cx Core Freq.

CPU

Cloud OS

Un-core Freq.Core Off/On

RAPL

DIMM

NIC,Storage…

PlatformNM Rack & DC Facility

Open RMC DC Power

DC Cooling

Cloud Power-Performance Telemetry and Mgmt. API (Redfish Based)

Intel Cloud Power and Performance Solution

Workload PnP Analytic and Simulation

Open BMC

Workload

Smart BBU

Page 14: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

Redfish Adoption and Practices

MANAGEMENT

Case Studies

Page 15: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

• Interoperability - Infrastructure plane vs. Resource plane, server vs. facility (rack, IDC) etc.

• Consistent API model for In band interface and Out of band interface• Support runtime configuration and cloud scale deployment

API Requirement in Cloud Power-Performance Optimization

Host OS(In-band)Open BMC

(Out of band)

Redfish API Redfish API

Rack, Rack BBU

Open RMC

Redfish API

IDC

Redfish API

Power System Cooling System

Managed Computing Platform

RedfishOpenEDKII

Cloud OS(Infrastructure and Orchestration)

Resource plane Infrastructure plane

MANAGEMENT

Case StudiesStarting with extensible and scalable Redfish model

Page 16: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

Redfish Practices• Unified power control API model to support hierarchical

power capping (platform, rack, cluster)• Consistent model to support RAPL and Smart BBS

based peak shaving.

• Orchestration need insight of power redundancy of managed system for intelligent scheduling policy

• Describe complex power redundancy model which is across platform, chassis and rack

"PowerControl": [ { …. "PowerMetrics": { "MinConsumedWatts": 7200, "MaxConsumedWatts": 7900, "AverageConsumedWatts": 7600 }, "PowerLimit": { "LimitInWatts": 7900, "DynamicPeakCapacityWatts": 1000, "DynamicPeakCapacityDuration": "P3S", "LimitException": "LogEventOnly", "CorrectionInMs": 50 }, "Status": { "State": "Enabled", "Health": "OK" }, } ],

Rack

Filter

PSU……

PSUBBU

Power Shelf

BBSPDU ServersServer

12v or 48v

12v or 48v

Server

ServerAC

AC

…BBU

AC – Feed A

PDUAC – Feed B

R2 R3

R5

R1

R4

MANAGEMENT

Case Studies

Redfish based API reduce deployment complexity of data center power management

Page 17: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

Call to Action

MANAGEMENT

Case Studies

Page 18: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement

MANAGEMENT

Case Studies

• Performance SLA driven power optimization is critical for TCO optimization and PUE efficiency. Need platform and solution co-innovation to support dynamic, flexible and workload aware optimization policies.

• Open and standard API able to reduce deployment cost in large scale cloud environment. Need collaboration to define common telemetries and control interface for OCP platform, e.g. via baseline OCP HW Mgmt. profile.

• Cloud developers and users need to understand and define their performance requirements for their cloud apps

• Get involved:https://www.opencompute.org/projects/hardware-management

Call to Action

Page 19: Power Capping with Cloud.… · • Consistent API model for In band interface and Out of band interface • Support runtime configuration and cloud scale deployment API Requirement