Transcript of HPC VISION FOR CLOUD & EXASCALE (archive.hpcsaudi.org/events/2011_khobar...)
©2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice
Philippe Trautmann
EMEA Sales Manager HPC & POD
Patrick DEMICHEL
Senior Architect HPC
Hyperscale BU / ISS
HPC VISION FOR CLOUD & EXASCALE
Dec. 6, 2011
HPC growth creates new opportunities for account growth
Where research, engineering and analysis is the business, new expectations are set and innovation is required:
- Performance: time to innovation
- Efficiency: reduced cost & power
- Agility: improved quality, response to change
- Competitiveness
Win by enabling our customer's innovation and competitiveness
HPC is all around you
- Media & Entertainment: rendering, gaming
- GeoSciences: seismic, reservoir modeling
- Engineering/Manufacturing: structural analysis, fluid dynamics, impact modeling
- Government/Classified: cryptography, military/security, nuclear safety
- Financial Services: financial analytics, high-frequency trading
- Life Sciences: drug design, next-gen sequencing, bioinformatics
- Government/Academic Research: particle physics, life sciences, climate modeling
- Engineering/Manufacturing: electrical design, circuit verification, board layout
HP delivers innovation at any scale.
Accelerate innovation with HP.
Scalable performance: speed advancements with a converged infrastructure, purpose-built for scale
- Breakthrough performance in systems purpose-built for scale
Maximum efficiency: optimize your performance footprint with the world's most efficient systems
- Continued demand for POD in HPC
Instant-On agility: deploy easily, adapt quickly to change, and improve quality of service
- Breaking the barriers to HPC Cloud
Barriers to innovation and scale: realized system performance and throughput; power capacity and cost; infrastructure complexity and inflexibility
The Data Center of the future must be built on a Converged Infrastructure
Management software
Network
Servers
Power & cooling
Storage
HP Converged
Infrastructure
Scalable performance
Based on AMD Opteron 6200 processors
New HP ProLiant servers, purpose-built for scale
Increased performance for HPC workloads
- Up to 50% performance increase over previous-generation servers (1)
Increased performance per $/watt/ft²
- Up to 2,048 cores per 42U rack, in either 2- or 4-socket systems (2)
Modular, flexible configurations for HPC workloads
- Up to 18 TB storage (3) or up to 1 TB memory (4) per server
New ProLiant BL465c G7 and BL685c G7
New ProLiant DL165 G7, DL365 G7 and DL585 G7

1. Linpack, 2P 16-core Interlagos @ 2.6 GHz vs. 12-core Magny-Cours @ 2.2 GHz
2. BL465c G7 or BL685c G7
3. DL385 G7 with six 3 TB hot-plug LFF SAS drives
4. DL585 G7 or BL685c G7 with 32 GB DIMMs
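The 2,048-cores-per-rack figure can be sanity-checked in a few lines. The enclosure and blade counts below (standard c7000 capacities) are assumptions for illustration, not stated on the slide:

```python
# Check the "up to 2,048 cores per 42U rack" claim for both quoted blades.
# Assumption: standard c7000 enclosure capacities (16 half-height or
# 8 full-height blades per 10U enclosure), 4 enclosures per 42U rack.

cores_per_socket = 16  # AMD Opteron 6200 ("Interlagos")

# Half-height 2-socket BL465c G7: 16 blades per enclosure
half_height_cores = 16 * (2 * cores_per_socket)  # 512 cores per enclosure
# Full-height 4-socket BL685c G7: 8 blades per enclosure
full_height_cores = 8 * (4 * cores_per_socket)   # 512 cores per enclosure

enclosures_per_rack = 4  # 4 x 10U enclosures fit in a 42U rack

assert half_height_cores * enclosures_per_rack == 2048
assert full_height_cores * enclosures_per_rack == 2048
```

Both blade options reach the same 2,048-core density under these assumptions, which is consistent with footnote 2 naming either server.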
Technology preview: future HP ProLiant SL6500 systems
New levels of performance for the most demanding workloads
- High-performance 2P systems based on the future Intel® Xeon® processor E5 family
- Integrated GPUs (up to 1, 3, or 8), optimized for I/O bandwidth to the GPUs
- Integrated high-performance networking, including Mellanox CX3 for 56 Gb/s FDR InfiniBand at full bandwidth
Highly efficient SL6500 multi-node infrastructure
- Optimized for performance/$/watt/ft²
s6500 chassis (pictured with assorted current servers)
Future SL6500 half-width 2P server with integrated Mellanox CX3
- 1U version, up to 1 GPU
- 2U version, up to 3 GPUs
- 4U version, up to 8 GPUs
Energy, cost and space savings move the industry to new infrastructure
Project Moonshot: breakthrough savings and simplicity
- Traditional x86: 400 servers, 10 racks, 20 switches, 1,600 cables, 91 kilowatts, $3.3M
- HP 'Redstone' server: 1,600 servers, 1/2 rack, 2 switches, 41 cables, 9.9 kilowatts, $1.2M
- Savings: 89% less energy, 94% less space, 63% less cost, 97% less complexity
Select hyperscale, web, and data analytics applications show tremendous promise
Based on weighted average performance projections for workloads such as web serving, memcached, and data analytics. Cost estimates include infrastructure, space, and power and cooling costs over three years.
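The quoted savings percentages follow from the raw slide numbers. A quick check, assuming "complexity" means the combined switch and cable count (the slide does not define the metric):

```python
# Verify the Moonshot savings claims against the per-configuration figures.
# Assumption: "complexity" = switches + cables; this is inferred, not stated.

traditional = {"cost_musd": 3.3, "racks": 10, "switches": 20, "cables": 1600, "kw": 91}
redstone    = {"cost_musd": 1.2, "racks": 0.5, "switches": 2, "cables": 41, "kw": 9.9}

def saving(old, new):
    """Fractional reduction going from old to new."""
    return 1 - new / old

energy = saving(traditional["kw"], redstone["kw"])                # ~0.891
space  = saving(traditional["racks"], redstone["racks"])          # 0.95
cost   = saving(traditional["cost_musd"], redstone["cost_musd"])  # ~0.636
complexity = saving(traditional["switches"] + traditional["cables"],
                    redstone["switches"] + redstone["cables"])    # ~0.973

assert abs(energy - 0.89) < 0.01      # "89% less energy"
assert abs(cost - 0.636) < 0.01       # slide rounds to "63% less cost"
assert abs(space - 0.95) < 0.02       # slide quotes "94% less space"
assert abs(complexity - 0.97) < 0.01  # "97% less complexity"
```

The numbers line up within rounding, which suggests the headline percentages are derived directly from these configuration figures.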
Perfect for development and testing with unparalleled density, flexibility, and simplicity
HP 'Redstone' Server Development Platform
- ProLiant SL6500 chassis; HP 'Redstone' development platform server tray
- Up to 72 servers in a single half-width 2U tray; 4 trays in a single 4U chassis
Shared SL6500 scalable system enclosure
- Pooled power: 4 common-slot power supplies
- Shared cooling: 8 shared fans, N+1, rear-serviceable
- Integrated, configurable network fabric with up to 16 10 Gb uplinks
Up to 288 servers: 18 quad-node compute cartridges per server tray
- Calxeda EnergyCore™ quad-core ARM SoCs with 4 MB L2 cache
- Up to 4 GB ECC memory (up to 1333 MHz) per server
- Integrated management
Shared and configurable storage
- Diskless, or up to 4 SATA drives (1-drive cartridges) per server
- Up to 192 SSDs or 96 2.5″ SFF HDDs per enclosure
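The density figures on this slide are internally consistent, which a few lines of arithmetic confirm:

```python
# Check the Redstone density claims: 18 quad-node cartridges per tray,
# 4 trays per 4U SL6500 chassis (all figures from the slide).

nodes_per_cartridge = 4
cartridges_per_tray = 18
trays_per_chassis = 4

servers_per_tray = nodes_per_cartridge * cartridges_per_tray  # 72, as quoted
servers_per_chassis = servers_per_tray * trays_per_chassis    # 288, as quoted

assert servers_per_tray == 72
assert servers_per_chassis == 288
```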
Maximum efficiency
Delivering maximum density and serviceability
- 1/10th the space: up to 4,400 servers; heterogeneous, based on industry standards
- 7X capacity per rack: average 30 kW per 50U rack (69 kW peak); closely coupled cooling; hot/cold aisle containment
- Enhanced serviceability and simplicity: hot/cold aisle layout; shared service aisle module; traditional data center service model
No-compromise approach to modularity and density
10,000 ft² data center in a compact, serviceable package:
- 40 ft, 22-rack IT POD modules
- Shared service aisle module
- DX cooling, air-side economizers
- Extra-wide modular hot aisle
- 39.5″ cold-aisle serviceability
- 22 industry-standard 50U racks
- 8 ft
Full spectrum of leading modular data center alternatives: HP modular computing portfolio
- Custom HP PODs: custom-designed by HP; air/water cooled; variety of capacity/footprint. Custom offerings for extreme-scale environments.
- HP POD 240a: optimized efficiency; 2,200U, 29 kW/rack avg.; POD benefits with a data center feel. Maximum efficiency, affordability, and flexibility.
- HP POD 20c and 40c: efficient power and cooling; water cooled; up to 1,100U, 29 kW/rack. Balanced efficiency and modularity.
- HP Flexible Data Center: traditional facilities design; energy efficient; 3.6 MW capacity facility. Flexible, efficient modular brick-and-mortar alternative.
43 million square feet delivered, and accelerating
Instant-On Agility
HP's powerful hyperscale cluster manager taps new Insight technologies
Coming in Q1: HP Insight CMU 7.0
- Provision: simplified discovery; fast and scalable
- Monitor: 'at a glance'; lightweight
- Control: GUI and CLI options; easy, frictionless
Leverages leadership Insight server management:
- Simplified configuration and improved performance with the next generation of iLO
- Integration of HP SUM to install drivers
- BIOS version consistency and settings checker
- Out-of-band agentless monitoring
- CMU integration with SIM event management
And more:
- Unique 3-D history displays for performance analysis for Hadoop, and Active Health System
- Integration with HP CloudSystem
- Tight integration with key HPC resource tools
Cloud & HPC
The Sacred Six: what should a cloud deliver? Fundamentals for a hybrid cloud:
- Automated infrastructure-to-app lifecycle management
- Public, private, hybrid
- Broad ecosystem of OSes, hypervisors, apps
- Unified service delivery
- Security
- Scalability
HPC CLOUD CHALLENGES
1. Commodity interconnects
   - HPC apps are in general latency-bound and often bandwidth-bound
   - Cloud commodity interconnects are inadequate
   - Low-latency interconnects are in general not financially viable in the cloud
   - The question remains how to set up HPC-specialized clouds
2. Virtualization & scheduling
   - Virtualization is a cornerstone of cloud computing
   - Several technologies are coming to the rescue: multi-core and many-core, I/O virtualization
   - Parallel scheduling is essential to run and complete parallel tasks
3. Intellectual property
   - When a company's future depends on it, trusting the cloud becomes harder
   - Private HPC clouds will be the only alternative for critical applications
A way forward: HPC in the cloud
- HPC in the cloud creates special challenges
- HP and Intel are investigating the formation of a Special Interest Group (SIG) which would create a hybrid HPC computing environment spanning workstations to clusters to private and public clouds, enabling technical computing users to:
  - Take advantage of HPC resources that they're not yet using
  - Expand their usage of HPC where they're constrained in their access to HPC resources
  - Use HPC more efficiently, to make better use of resources, and make it more adaptable to changing workloads
HP delivers
Recent references/success stories
- Airbus
  - The largest industrial supercomputer in the world (as of June '11)
  - 2,048 BL280c servers, with storage and QDR InfiniBand
  - Deployed in only 4 months in 40' PODs in Toulouse and Hamburg
- Purdue University
  - Fastest campus system in the US, based on future SL6500 systems
  - Deployed in 3 weeks; Top500 #54; 86.87% efficiency
  - Largest Sandy Bridge/FDR system installed today
- ENI (Italy)
  - Large oil & gas reference using HP's SL6500/SL390s servers
  - 1,247 SL390s servers, implemented in less than 2 months
How we win: performance, efficiency and agility
Photos courtesy of Airbus
HP delivers high-performance innovation at any scale.
Accelerate innovation with HP:
- Scalable performance: speed advancements with a converged infrastructure, purpose-built for scale.
- Maximum efficiency: optimize your performance footprint with the world's most efficient systems.
- Instant-On agility: deploy easily, adapt quickly to change, and improve quality of service.
HP Converged Infrastructure
Patrick DEMICHEL
Senior Architect HPC
Hyperscale BU / ISS
HPC VISION FOR CLOUD & EXASCALE
Intelligent Infrastructure
END STATE: capture more value via dramatic computing performance and cost improvements
HP LABS' RESEARCH CONTRIBUTION: radical new approaches for collecting, storing and transmitting data to feed the exascale data center
BIG BETS:
- Next-generation scalable storage: cloud-scale, dynamic, secure
- Networking: open, flexible, programmable wired and wireless platform
- CeNSE: nano-scale sensors creating a Central Nervous System for the Earth
- Next-generation data centers: exascale, photonic interconnects
- Non-volatile memory and storage: memristor
Vision for exascale: improve performance/TCO by 10X
- Efficiency:
  - Interconnects using photons: 5x (short term, 5 years) with optical links between nodes; 10x (long term) with nanophotonics (+10x bandwidth)
  - Nodes with 256 cores: 10 TFlops / 200 watts
  - Memory hierarchy extended with memristors
- Manage: 1 operator for 100K nodes
- Auto-detect and auto-repair failures:
  - Checkpoint/restart integrated and transparent
Four research axes as priorities:
- Optical interconnects: scalability up to 1M nodes
- Basic blocks for compute: Corona project
- System software: 1 operator for 100K nodes
- Programmability: reliability, efficiency
Photonics technologies
Point-to-point DWDM (dense wavelength-division multiplexing) link:
- Wavelength: 1310 nm
- Wavelengths: 64
- Channel spacing: 80 GHz
- Modulation frequency: 10 GHz
- Data rate: 80 GBytes/s
- Link power: 128 mW
- Energy: 200 fJ/bit
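The link parameters above are mutually consistent, assuming one bit per symbol per wavelength at the 10 GHz modulation rate. A quick check:

```python
# Verify the DWDM link table: aggregate data rate and link power
# follow from the wavelength count, modulation rate, and energy/bit.

wavelengths = 64
mod_rate_hz = 10e9           # modulation frequency per wavelength
energy_j_per_bit = 200e-15   # 200 fJ/bit

bits_per_s = wavelengths * mod_rate_hz        # 640 Gb/s aggregate
bytes_per_s = bits_per_s / 8                  # 80 GBytes/s, matching the table
link_power_w = bits_per_s * energy_j_per_bit  # 0.128 W = 128 mW, matching the table

assert bytes_per_s == 80e9
assert abs(link_power_w - 0.128) < 1e-9
```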
Ring resonators: one basic structure, three applications (SiGe-doped)
- A modulator: move in and out of resonance to modulate light on an adjacent waveguide
- A switch: transfers light between waveguides only when the resonator is tuned
- A wavelength-specific detector: add a doped junction to perform the receive function
HP photonics technologies: system-level architecture to large-scale integration
Timeline: now, 1 year, 3 years, 5 years, 7 years, 10 years
- Devices: active cable → hybrid laser cable → silicon PIC → on-chip interconnect; single wavelength → CWDM → DWDM; 100 pJ/bit improving toward 0.1 pJ/bit
- Architectures: optical bus → optically connected memory (nodes 0-3) → optical backplane → HyperX & ensemble → Corona
System Architecture
Compute node architecture
- Single-socket, highly parallel CPU
- Coherency domain is a single compute complex
- Tightly coupled DRAM: direct stacking or high-performance substrate
- Local checkpointing memory
- Memory expansion through photonically connected memory stacks; option to exploit new memory technologies
- Integrated network interface: essential to meet power and bandwidth goals
- Photonic interconnect for all connections off the compute complex

Node performance targets:
- Node peak performance: 12-14 TFlops
- Memory BW: >4 TByte/s
- Node network BW: 400 GByte/s
- Power: <200 W
System architecture
- Single converged data network
  - Separate fabric device
  - Option to vary compute-to-communication ratio
  - Heterogeneous systems possible
- Gateway to external network
  - Embedded in network for power efficiency
  - Tertiary storage accessed via external network
- Node types
  - Single hardware node architecture
  - Distinguished by software (workload and OS)
  - Flexible allocation
  - Variable memory amounts
- Orthogonal control network
  - Minimize compute CPU interrupts
  - "As simple as possible but no simpler"
128K-node system
- 16 × 16 array of enclosures (16 rows × 16 columns)
- 32 cards per enclosure
- 72-way parallel fiber ribbon; 37-core photonic crystal fiber (PCF)
- 15 8-link fiber cables in S1; 15 8-link fiber cables in S2
- Total fiber ribbons: 16² × (15 + 15) = 7,680
- Peak performance: 16 × 16 × 5 PF = 1.3 EF
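The slide's cabling and performance totals can be reconstructed from the enclosure counts. The ribbon formula (16² enclosures times 15 cables in each of the two stages S1 and S2) is inferred from the figures on the slide rather than stated explicitly:

```python
# Reconstruct the arithmetic behind the 128K-node system slide.
# Assumption: total ribbons = enclosures * (S1 cables + S2 cables).

enclosures = 16 * 16            # 16 x 16 array
cables_s1, cables_s2 = 15, 15   # 8-link fiber cables per stage

fiber_ribbons = enclosures * (cables_s1 + cables_s2)
peak_pf = enclosures * 5        # 5 PFlops per enclosure

assert fiber_ribbons == 7680
assert peak_pf == 1280          # ~1.3 EFlops, as quoted
```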
Technologies for checkpoint/restart: HDD, NAND Flash, PCRAM, DRAM, memristor

                  HDD       NAND Flash   PCRAM
Cell size         -         4-6 F²       4-6 F²
Read cycle        ~4 ms     5-50 µs      10-100 ns
Write cycle       ~4 ms     2-3 ms       100-1000 ns
Standby power     ~1 W      ~0 W         ~0 W
Endurance         10^15     10^5         10^8 cycles

CMOS chip with memristive components
L. O. Chua, (1971)
www.nd.edu/~rich/SC09/tut157/SC2009_Jouppi_Xie_Tutorial_Final.pdf
Technology attributes
- Scaling down to less than 10 nm width per cell: ~32 GByte/cm²/layer by 2018
- Scaling up to multiple (≥8) layers on chip: ~0.25 TByte/cm²/chip by 2018
- Truly nonvolatile: many, many years
- Random access
- Fast cell write and erase (~nanoseconds)
- Low-energy cell write and erase (~picojoules)
- Good to excellent endurance (>10^10 cycles); still counting, the goal is to exceed 10^18 cycles
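The two density projections above are consistent with each other: the per-layer figure times eight layers reproduces the per-chip figure.

```python
# Check that ~32 GByte/cm^2/layer over 8 layers yields ~0.25 TByte/cm^2/chip,
# using binary prefixes (1 TByte = 1024 GByte).

gbyte_per_cm2_per_layer = 32
layers = 8

tbyte_per_cm2_chip = gbyte_per_cm2_per_layer * layers / 1024

assert tbyte_per_cm2_chip == 0.25
```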
Memristor path to NVRAM
- Compete commercially with Flash in ~3 years
- Solid-state drives soon thereafter
- Compete with DRAM in ~4-5 years
- On-processor NV cache in ~4-5 years
- Compete with SRAM in ~5-6 years
- Universal NV memory and storage in 7-8 years
- Rethinking the memory/storage hierarchy and interfaces now
The complete vision
Software
Software topics
- Development
  - Algorithms, scalability, verification & validation: we expect customers to lead in these areas
  - Programming languages, compilers, debuggers, performance tuners, language runtimes, libraries: these are not mainstream HP activities; we expect the user community and commercial ecosystem to provide them
- System management software
  - Job control, guaranteed service levels
  - Fault and bottleneck anticipation, discovery, diagnosis
  - Cluster availability
  - Networking
  - Energy minimization
  - Security
  - Storage
  These are key technologies, of importance in the commercial as well as the scientific sphere. HP will provide robust, efficient software at this level.
(Figure: a photonic interconnect linking pools of compute elements, memory elements, NV memory elements, and storage elements)
What's new here?
A "Computing Ensemble": bigger than a server, smaller than a datacenter, with built-in system software
- Disaggregated pools of uncommitted compute, memory, and storage elements
- Optical interconnects enable dynamic, on-demand composition
- Ensemble OS software using virtualization for composition and management
- Management and programming virtual appliances add value for IT and application developers
Layers: on-demand composition; Ensemble OS management; Ensemble programming
EXASCALE SYSTEM SUPPORT
- Trends
  - From hardware break-fix to higher levels (software, services)
  - Significant integration between serviceability & manageability
  - Level of automation is critical; move to lower-cost deliveries
  - Self-healing at lower levels (a function of cost)
  - Failures in infrastructure transparent to the service customer
- Challenges
  - End-to-end automation, noise in data, no faults found
  - Knowledge hard to search, store, share, use
  - Back-end analysis (forecast, trend), global knowledge, closed loops
- Opportunities
  - Clean data: resulting from end-to-end unified serviceability and self-healing
  - Actionable knowledge: transparently captured, enabled by clean data
  - Back-end analysis: simplified by clean data and actionable knowledge
(Figure: service analytics combining serviceability, HW manageability, and SW manageability/ITIL; delivery methods range from human-entered and reactive to preventive, deferred, and automated)
THANK YOU
QUESTIONS?