Transcript of HPC VISION FOR CLOUD & EXASCALE (archive.hpcsaudi.org/events/2011_khobar...)
©2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice
Philippe Trautmann
EMEA Sales Manager HPC & POD
Patrick DEMICHEL
Senior Architect HPC
Hyperscale BU / ISS
HPC VISION FOR CLOUD & EXASCALE
Dec. 6, 2011
HPC growth creates new opportunities for account growth
Where research, engineering and analysis is the business, new expectations are set and innovation is required:
- Performance: time to innovation
- Efficiency: reduced cost & power
- Agility: improved quality, response to change
- Competitiveness
Win by enabling our customer's innovation and competitiveness
HPC is all around you
- Media & Entertainment: rendering, gaming
- GeoSciences: seismic, reservoir modeling
- Engineering/Manufacturing: structural analysis, fluid dynamics, impact modeling
- Government/Classified: cryptography, military/security, nuclear safety
- Financial Services: financial analytics, high-frequency trading
- Life Sciences: drug design, next-gen sequencing, bioinformatics
- Government/Academic Research: particle physics, life sciences, climate modeling
- Engineering/Manufacturing: electrical design, circuit verification, board layout
HP delivers innovation at any scale.
Accelerate innovation with HP.
Scalable performance: speed advancements with a converged infrastructure, purpose-built for scale
- Breakthrough performance in systems purpose-built for scale
Maximum efficiency: optimize your performance footprint with the world's most efficient systems
- Continued demand for POD in HPC
Instant-On agility: deploy easily, adapt quickly to change, and improve quality of service
- Breaking the barriers to HPC Cloud
Barriers to innovation and scale: realized system performance and throughput; power capacity and cost; infrastructure complexity and inflexibility
The Data Center of the future must be built on a Converged Infrastructure
Management software
Network
Servers
Power & cooling
Storage
HP Converged
Infrastructure
Scalable performance
Based on AMD Opteron 6200 processors
New HP ProLiant servers, purpose-built for scale
Increased performance for HPC workloads
- Up to 50% performance increase over previous-generation servers (1)
Increased performance per $/watt/ft²
- Up to 2,048 cores per 42U rack, in either 2- or 4-socket systems (2)
Modular, flexible configurations for HPC workloads
- Up to 18 TB storage (3) or up to 1 TB memory (4) per server
New ProLiant BL465c G7 and BL685c G7
New ProLiant DL165 G7, DL365 G7 and DL585 G7

1. Linpack, 2P 16-core Interlagos @ 2.6 GHz vs. 12-core Magny-Cours @ 2.2 GHz
2. BL465c G7 or BL685c G7
3. DL385 G7 with six 3 TB hot-plug LFF SAS drives
4. DL585 G7 or BL685c G7 with 32 GB DIMMs
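The 2,048-cores-per-rack figure can be sanity-checked in a few lines. The enclosure and blade counts below (standard c7000 capacities) are assumptions for illustration, not stated on the slide:

```python
# Check the "up to 2,048 cores per 42U rack" claim for both quoted blades.
# Assumption: standard c7000 enclosure capacities (16 half-height or
# 8 full-height blades per 10U enclosure), 4 enclosures per 42U rack.

cores_per_socket = 16  # AMD Opteron 6200 ("Interlagos")

# Half-height 2-socket BL465c G7: 16 blades per enclosure
half_height_cores = 16 * (2 * cores_per_socket)  # 512 cores per enclosure
# Full-height 4-socket BL685c G7: 8 blades per enclosure
full_height_cores = 8 * (4 * cores_per_socket)   # 512 cores per enclosure

enclosures_per_rack = 4  # 4 x 10U enclosures fit in a 42U rack

assert half_height_cores * enclosures_per_rack == 2048
assert full_height_cores * enclosures_per_rack == 2048
```

Both blade options reach the same 2,048-core density under these assumptions, which is consistent with footnote 2 naming either server.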
Technology preview: future HP ProLiant SL6500 systems
New levels of performance for the most demanding workloads
- High-performance 2P systems based on the future Intel® Xeon® processor E5 family
- Integrated GPUs (up to 1, 3, or 8), optimized for I/O bandwidth to the GPUs
- Integrated high-performance networking, including Mellanox CX3 for 56 Gb/s FDR InfiniBand at full bandwidth
Highly efficient SL6500 multi-node infrastructure
- Optimized for performance/$/watt/ft²
s6500 chassis (pictured with assorted current servers)
Future SL6500 half-width 2P server with integrated Mellanox CX3
- 1U version, up to 1 GPU
- 2U version, up to 3 GPUs
- 4U version, up to 8 GPUs
Energy, cost and space savings move the industry to new infrastructure
Project Moonshot: breakthrough savings and simplicity
- Traditional x86: 400 servers, 10 racks, 20 switches, 1,600 cables, 91 kilowatts, $3.3M
- HP 'Redstone' server: 1,600 servers, 1/2 rack, 2 switches, 41 cables, 9.9 kilowatts, $1.2M
- Savings: 89% less energy, 94% less space, 63% less cost, 97% less complexity
Select hyperscale, web, and data analytics applications show tremendous promise
Based on weighted average performance projections for workloads such as web serving, memcached, and data analytics. Cost estimates include infrastructure, space, and power and cooling costs over three years.
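The quoted savings percentages follow from the raw slide numbers. A quick check, assuming "complexity" means the combined switch and cable count (the slide does not define the metric):

```python
# Verify the Moonshot savings claims against the per-configuration figures.
# Assumption: "complexity" = switches + cables; this is inferred, not stated.

traditional = {"cost_musd": 3.3, "racks": 10, "switches": 20, "cables": 1600, "kw": 91}
redstone    = {"cost_musd": 1.2, "racks": 0.5, "switches": 2, "cables": 41, "kw": 9.9}

def saving(old, new):
    """Fractional reduction going from old to new."""
    return 1 - new / old

energy = saving(traditional["kw"], redstone["kw"])                # ~0.891
space  = saving(traditional["racks"], redstone["racks"])          # 0.95
cost   = saving(traditional["cost_musd"], redstone["cost_musd"])  # ~0.636
complexity = saving(traditional["switches"] + traditional["cables"],
                    redstone["switches"] + redstone["cables"])    # ~0.973

assert abs(energy - 0.89) < 0.01      # "89% less energy"
assert abs(cost - 0.636) < 0.01       # slide rounds to "63% less cost"
assert abs(space - 0.95) < 0.02       # slide quotes "94% less space"
assert abs(complexity - 0.97) < 0.01  # "97% less complexity"
```

The numbers line up within rounding, which suggests the headline percentages are derived directly from these configuration figures.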
Perfect for development and testing with unparalleled density, flexibility, and simplicity
HP 'Redstone' Server Development Platform
- ProLiant SL6500 chassis; HP 'Redstone' development platform server tray
- Up to 72 servers in a single half-width 2U tray; 4 trays in a single 4U chassis
Shared SL6500 scalable system enclosure
- Pooled power: 4 common-slot power supplies
- Shared cooling: 8 shared fans, N+1, rear-serviceable
- Integrated, configurable network fabric with up to 16 10 Gb uplinks
Up to 288 servers: 18 quad-node compute cartridges per server tray
- Calxeda EnergyCore™ quad-core ARM SoCs with 4 MB L2 cache
- Up to 4 GB ECC memory (up to 1333 MHz) per server
- Integrated management
Shared and configurable storage
- Diskless, or up to 4 SATA drives (1-drive cartridges) per server
- Up to 192 SSDs or 96 2.5″ SFF HDDs per enclosure
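The density figures on this slide are internally consistent, which a few lines of arithmetic confirm:

```python
# Check the Redstone density claims: 18 quad-node cartridges per tray,
# 4 trays per 4U SL6500 chassis (all figures from the slide).

nodes_per_cartridge = 4
cartridges_per_tray = 18
trays_per_chassis = 4

servers_per_tray = nodes_per_cartridge * cartridges_per_tray  # 72, as quoted
servers_per_chassis = servers_per_tray * trays_per_chassis    # 288, as quoted

assert servers_per_tray == 72
assert servers_per_chassis == 288
```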
Maximum efficiency
Delivering maximum density and serviceability
- 1/10th the space: up to 4,400 servers; heterogeneous, based on industry standards
- 7X capacity per rack: average 30 kW per 50U rack (69 kW peak); closely coupled cooling; hot/cold aisle containment
- Enhanced serviceability and simplicity: hot/cold aisle layout; shared service aisle module; traditional data center service model
No-compromise approach to modularity and density
10,000 ft² data center in a compact, serviceable package:
- 40 ft, 22-rack IT POD modules
- Shared service aisle module
- DX cooling, air-side economizers
- Extra-wide modular hot aisle
- 39.5″ cold-aisle serviceability
- 22 industry-standard 50U racks
- 8 ft
Full spectrum of leading modular data center alternatives: HP modular computing portfolio
- Custom HP PODs: custom-designed by HP; air/water cooled; variety of capacity/footprint. Custom offerings for extreme-scale environments.
- HP POD 240a: optimized efficiency; 2,200U, 29 kW/rack avg.; POD benefits with a data center feel. Maximum efficiency, affordability, and flexibility.
- HP POD 20c and 40c: efficient power and cooling; water cooled; up to 1,100U, 29 kW/rack. Balanced efficiency and modularity.
- HP Flexible Data Center: traditional facilities design; energy efficient; 3.6 MW capacity facility. Flexible, efficient modular brick-and-mortar alternative.
43 million square feet delivered, and accelerating
Instant-On Agility
HP's powerful hyperscale cluster manager taps new Insight technologies
Coming in Q1: HP Insight CMU 7.0
- Provision: simplified discovery; fast and scalable
- Monitor: 'at a glance'; lightweight
- Control: GUI and CLI options; easy, frictionless
Leverages leadership Insight server management:
- Simplified configuration and improved performance with the next generation of iLO
- Integration of HP SUM to install drivers
- BIOS version consistency and settings checker
- Out-of-band agentless monitoring
- CMU integration with SIM event management
And more:
- Unique 3-D history displays for performance analysis for Hadoop, and Active Health System
- Integration with HP CloudSystem
- Tight integration with key HPC resource tools
Cloud & HPC
The Sacred Six: what should a cloud deliver? Fundamentals for a hybrid cloud:
- Automated infrastructure-to-app lifecycle management
- Public, private, hybrid
- Broad ecosystem of OSes, hypervisors, apps
- Unified service delivery
- Security
- Scalability
HPC CLOUD CHALLENGES
1. Commodity interconnects
   - HPC apps are in general latency-bound and often bandwidth-bound
   - Cloud commodity interconnects are inadequate
   - Low-latency interconnects are in general not financially viable in the cloud
   - The question remains how to set up HPC-specialized clouds
2. Virtualization & scheduling
   - Virtualization is a cornerstone of cloud computing
   - Several technologies are coming to the rescue: multi-core and many-core, I/O virtualization
   - Parallel scheduling is essential to run and complete parallel tasks
3. Intellectual property
   - When a company's future depends on it, trusting the cloud becomes harder
   - Private HPC clouds will be the only alternative for critical applications
A way forward: HPC in the cloud
- HPC in the cloud creates special challenges
- HP and Intel are investigating the formation of a Special Interest Group (SIG) which would create a hybrid HPC computing environment spanning workstations to clusters to private and public clouds, enabling technical computing users to:
  - Take advantage of HPC resources that they're not yet using
  - Expand their usage of HPC where they're constrained in their access to HPC resources
  - Use HPC more efficiently, to make better use of resources, and make it more adaptable to changing workloads
HP delivers
Recent references/success stories
- Airbus
  - The largest industrial supercomputer in the world (as of June '11)
  - 2,048 BL280c servers, with storage and QDR InfiniBand
  - Deployed in only 4 months in 40' PODs in Toulouse and Hamburg
- Purdue University
  - Fastest campus system in the US, based on future SL6500 systems
  - Deployed in 3 weeks; Top500 #54; 86.87% efficiency
  - Largest Sandy Bridge/FDR system installed today
- ENI (Italy)
  - Large oil & gas reference using HP's SL6500/SL390s servers
  - 1,247 SL390s servers, implemented in less than 2 months
How we win: performance, efficiency and agility
Photos courtesy of Airbus
HP delivers high-performance innovation at any scale.
Accelerate innovation with HP:
- Scalable performance: speed advancements with a converged infrastructure, purpose-built for scale.
- Maximum efficiency: optimize your performance footprint with the world's most efficient systems.
- Instant-On agility: deploy easily, adapt quickly to change, and improve quality of service.
HP Converged Infrastructure
Patrick DEMICHEL
Senior Architect HPC
Hyperscale BU / ISS
HPC VISION FOR CLOUD & EXASCALE
Intelligent Infrastructure
END STATE: capture more value via dramatic computing performance and cost improvements
HP LABS' RESEARCH CONTRIBUTION: radical new approaches for collecting, storing and transmitting data to feed the exascale data center
BIG BETS:
- Next-generation scalable storage: cloud-scale, dynamic, secure
- Networking: open, flexible, programmable wired and wireless platform
- CeNSE: nano-scale sensors creating a Central Nervous System for the Earth
- Next-generation data centers: exascale, photonic interconnects
- Non-volatile memory and storage: memristor
Vision for exascale: improve performance/TCO by 10X
- Efficiency:
  - Interconnects using photons: 5x (short term, 5 years) with optical links between nodes; 10x (long term) with nanophotonics (+10x bandwidth)
  - Nodes with 256 cores: 10 TFlops / 200 watts
  - Memory hierarchy extended with memristors
- Manage: 1 operator for 100K nodes
- Auto-detect and auto-repair failures:
  - Checkpoint/restart integrated and transparent
Four research axes as priorities:
- Optical interconnects: scalability up to 1M nodes
- Basic blocks for compute: Corona project
- System software: 1 operator for 100K nodes
- Programmability: reliability, efficiency
Photonics technologies
Point-to-point DWDM (dense wavelength-division multiplexing) link:
- Wavelength: 1310 nm
- Wavelengths: 64
- Channel spacing: 80 GHz
- Modulation frequency: 10 GHz
- Data rate: 80 GBytes/s
- Link power: 128 mW
- Energy: 200 fJ/bit
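The link parameters above are mutually consistent, assuming one bit per symbol per wavelength at the 10 GHz modulation rate. A quick check:

```python
# Verify the DWDM link table: aggregate data rate and link power
# follow from the wavelength count, modulation rate, and energy/bit.

wavelengths = 64
mod_rate_hz = 10e9           # modulation frequency per wavelength
energy_j_per_bit = 200e-15   # 200 fJ/bit

bits_per_s = wavelengths * mod_rate_hz        # 640 Gb/s aggregate
bytes_per_s = bits_per_s / 8                  # 80 GBytes/s, matching the table
link_power_w = bits_per_s * energy_j_per_bit  # 0.128 W = 128 mW, matching the table

assert bytes_per_s == 80e9
assert abs(link_power_w - 0.128) < 1e-9
```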
Ring resonators: one basic structure, three applications (SiGe-doped)
- A modulator: move in and out of resonance to modulate light on an adjacent waveguide
- A switch: transfers light between waveguides only when the resonator is tuned
- A wavelength-specific detector: add a doped junction to perform the receive function
HP photonics technologies: system-level architecture to large-scale integration
Timeline: now, 1 year, 3 years, 5 years, 7 years, 10 years
- Devices: active cable → hybrid laser cable → silicon PIC → on-chip interconnect; single wavelength → CWDM → DWDM; 100 pJ/bit improving toward 0.1 pJ/bit
- Architectures: optical bus → optically connected memory (nodes 0-3) → optical backplane → HyperX & ensemble → Corona
System Architecture
Compute node architecture
- Single-socket, highly parallel CPU
- Coherency domain is a single compute complex
- Tightly coupled DRAM: direct stacking or high-performance substrate
- Local checkpointing memory
- Memory expansion through photonically connected memory stacks; option to exploit new memory technologies
- Integrated network interface: essential to meet power and bandwidth goals
- Photonic interconnect for all connections off the compute complex

Node performance targets:
- Node peak performance: 12-14 TFlops
- Memory BW: >4 TByte/s
- Node network BW: 400 GByte/s
- Power: <200 W
System architecture
- Single converged data network
  - Separate fabric device
  - Option to vary compute-to-communication ratio
  - Heterogeneous systems possible
- Gateway to external network
  - Embedded in network for power efficiency
  - Tertiary storage accessed via external network
- Node types
  - Single hardware node architecture
  - Distinguished by software (workload and OS)
  - Flexible allocation
  - Variable memory amounts
- Orthogonal control network
  - Minimize compute CPU interrupts
  - "As simple as possible but no simpler"
128K-node system
- 16 × 16 array of enclosures (16 rows × 16 columns)
- 32 cards per enclosure
- 72-way parallel fiber ribbon; 37-core photonic crystal fiber (PCF)
- 15 8-link fiber cables in S1; 15 8-link fiber cables in S2
- Total fiber ribbons: 16² × (15 + 15) = 7,680
- Peak performance: 16 × 16 × 5 PF = 1.3 EF
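The slide's cabling and performance totals can be reconstructed from the enclosure counts. The ribbon formula (16² enclosures times 15 cables in each of the two stages S1 and S2) is inferred from the figures on the slide rather than stated explicitly:

```python
# Reconstruct the arithmetic behind the 128K-node system slide.
# Assumption: total ribbons = enclosures * (S1 cables + S2 cables).

enclosures = 16 * 16            # 16 x 16 array
cables_s1, cables_s2 = 15, 15   # 8-link fiber cables per stage

fiber_ribbons = enclosures * (cables_s1 + cables_s2)
peak_pf = enclosures * 5        # 5 PFlops per enclosure

assert fiber_ribbons == 7680
assert peak_pf == 1280          # ~1.3 EFlops, as quoted
```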
Technologies for checkpoint/restart: HDD, NAND Flash, PCRAM, DRAM, memristor

                  HDD       NAND Flash   PCRAM
Cell size         -         4-6 F²       4-6 F²
Read cycle        ~4 ms     5-50 µs      10-100 ns
Write cycle       ~4 ms     2-3 ms       100-1000 ns
Standby power     ~1 W      ~0 W         ~0 W
Endurance         10^15     10^5         10^8 cycles

CMOS chip with memristive components
L. O. Chua, (1971)
www.nd.edu/~rich/SC09/tut157/SC2009_Jouppi_Xie_Tutorial_Final.pdf
Technology attributes
- Scaling down to less than 10 nm width per cell: ~32 GByte/cm²/layer by 2018
- Scaling up to multiple (≥8) layers on chip: ~0.25 TByte/cm²/chip by 2018
- Truly nonvolatile: many, many years
- Random access
- Fast cell write and erase (~nanoseconds)
- Low-energy cell write and erase (~picojoules)
- Good to excellent endurance (>10^10 cycles); still counting, the goal is to exceed 10^18 cycles
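The two density projections above are consistent with each other: the per-layer figure times eight layers reproduces the per-chip figure.

```python
# Check that ~32 GByte/cm^2/layer over 8 layers yields ~0.25 TByte/cm^2/chip,
# using binary prefixes (1 TByte = 1024 GByte).

gbyte_per_cm2_per_layer = 32
layers = 8

tbyte_per_cm2_chip = gbyte_per_cm2_per_layer * layers / 1024

assert tbyte_per_cm2_chip == 0.25
```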
Memristor path to NVRAM
- Compete commercially with Flash in ~3 years
- Solid-state drives soon thereafter
- Compete with DRAM in ~4-5 years
- On-processor NV cache in ~4-5 years
- Compete with SRAM in ~5-6 years
- Universal NV memory and storage in 7-8 years
- Rethinking the memory/storage hierarchy and interfaces now
The complete vision
Software
Software topics
- Development
  - Algorithms, scalability, verification & validation: we expect customers to lead in these areas
  - Programming languages, compilers, debuggers, performance tuners, language runtimes, libraries: these are not mainstream HP activities; we expect the user community and commercial ecosystem to provide them
- System management software
  - Job control, guaranteed service levels
  - Fault and bottleneck anticipation, discovery, diagnosis
  - Cluster availability
  - Networking
  - Energy minimization
  - Security
  - Storage
  These are key technologies, of importance in the commercial as well as the scientific sphere. HP will provide robust, efficient software at this level.
(Figure: a photonic interconnect linking pools of compute elements, memory elements, NV memory elements, and storage elements)
What's new here?
A "Computing Ensemble": bigger than a server, smaller than a datacenter, with built-in system software
- Disaggregated pools of uncommitted compute, memory, and storage elements
- Optical interconnects enable dynamic, on-demand composition
- Ensemble OS software using virtualization for composition and management
- Management and programming virtual appliances add value for IT and application developers
Layers: on-demand composition; Ensemble OS management; Ensemble programming
EXASCALE SYSTEM SUPPORT
- Trends
  - From hardware break-fix to higher levels (software, services)
  - Significant integration between serviceability & manageability
  - Level of automation is critical; move to lower-cost deliveries
  - Self-healing at lower levels (a function of cost)
  - Failures in infrastructure transparent to the service customer
- Challenges
  - End-to-end automation, noise in data, no faults found
  - Knowledge hard to search, store, share, use
  - Back-end analysis (forecast, trend), global knowledge, closed loops
- Opportunities
  - Clean data: resulting from end-to-end unified serviceability and self-healing
  - Actionable knowledge: transparently captured, enabled by clean data
  - Back-end analysis: simplified by clean data and actionable knowledge
(Figure: service analytics combining serviceability, HW manageability, and SW manageability/ITIL; delivery methods range from human-entered and reactive to preventive, deferred, and automated)
THANK YOU
QUESTIONS?