…updates…
![Page 1: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/1.jpg)
Powering high-end x86 systems
Aggregate. Scale. Simplify. Save.
…updates…
04/22/2023
![Page 2: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/2.jpg)
Aggregate. Scale. Simplify. Save. Confidential and Proprietary
New Server Virtualization Paradigm
Existing: Partitioning
ENTERPRISE APPLICATIONS: Applications requiring a fraction of the physical server resources
[Diagram: one Hypervisor or VMM hosting several Virtual Machines, each running its own App and OS]
New: Aggregation
HIGH PERFORMANCE COMPUTING: Applications requiring a superset of the physical server resources
[Diagram: one Virtual Machine (App, OS) spanning four servers, each running a Hypervisor or VMM]
![Page 3: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/3.jpg)
Existing HPC Deployment Models
Applications requiring a superset of the physical server resources
Scale-Out: Break the problem to fit the hardware
Scale-Up: Fit the hardware to the problem size
![Page 4: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/4.jpg)
Existing HPC Deployment Models
PROS AND CONS
Scale-Out: Break the problem to fit the hardware
Pros (+)
• Leverages industry standard servers
• Low cost
• Open architecture
Cons (-)
• High installation & management cost
• Complex parallel programming
• Multiple operating systems
• Cluster file systems, etc.
Scale-Up: Fit the hardware to the problem size
Pros (+)
• Simplified IT infrastructure
• Simple and flexible programming
• Single system to manage
• Consolidated I/O
Cons (-)
• Proprietary hardware design
• High cost
• Architecture lock-in
![Page 5: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/5.jpg)
Existing HPC Deployment Models
PROS AND CONS
Scale-Out (+)
• Leverages industry standard servers
• Low cost
• Open architecture
Scale-Up (+)
• Simplified IT infrastructure
• Simple and flexible programming
• Single system to manage
• Consolidated I/O
Aggregation
[Diagram: one Virtual Machine (App, OS) spanning four servers, each running a Hypervisor or VMM]
![Page 6: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/6.jpg)
vSMP Foundation – Background
THE NEED FOR AGGREGATION - TYPICAL USE CASES
[Diagram: vSMP Foundation presenting one Virtual Machine (App, OS) across four servers, each labeled Hypervisor or VMM]
Capabilities: Up to 16 nodes
• 32 processors (128 cores)
• 4 TB RAM
More at: http://www.scalemp.com/spec
Cluster Management
• Requirements driven by IT to simplify cluster deployment:
  – Single OS
  – InfiniBand complexity removal
  – Simplified I/O: faster scratch storage
  – Large memory is a plus
• OPEX savings
SMP Replacement
• Requirements driven by the end users per application characteristics:
  – Large memory
  – High core-count
  – IT simplification is a plus
• CAPEX savings
![Page 7: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/7.jpg)
Why Aggregate?
OVERCOMING LIMITATIONS OF EXISTING DEPLOYMENT MODELS
Fit the hardware to the problem size
• Alternative to costly and proprietary RISC systems
• Large memory x86 resource
  – Enable larger workloads that cannot be run otherwise
• High core-count x86 shared-memory resource with high memory bandwidth
  – Allow threaded applications to benefit from shared-memory systems
  – Reduced development time of custom code using OpenMP (vs. MPI)
[Diagram: two App/OS systems with relative cost, $$$$$ versus $$$]
![Page 8: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/8.jpg)
Why Aggregate?
OVERCOMING LIMITATIONS OF EXISTING DEPLOYMENT MODELS
Break the problem to fit the hardware
• Ease of use: one system to manage: fewer, larger nodes means less cluster management overhead
  – Single Operating System
  – Avoid cluster file systems
  – Hide InfiniBand complexities
• Shared I/O
  – Single process can utilize I/O bandwidth of multiple systems
[Diagram: multiple App/OS systems with relative cost, $$$$$$$$]
![Page 9: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/9.jpg)
Simplified Cluster - Example
![Page 10: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/10.jpg)
Customers and Partners
[Logo groups: Commercial, Federal, Educational, Supported Platforms]
![Page 11: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/11.jpg)
Target Environments and Applications
Target Environments
• Users seeking to simplify cluster complexities
• Applications that use large memory footprint (even with one processor)
• Applications that need multiple processors and shared memory
Typical end-user applications
Manufacturing
• CSM (Computational Structural Mechanics): ABAQUS/Explicit, ABAQUS/Standard, ANSYS Mechanical, LSTC LS-DYNA, ALTAIR Radioss
• CFD (Computational Fluid Dynamics): FLUENT, ANSYS CFX, STAR-CD, AVL FIRE, Tgrid
• Other: inTrace OpenRT
Life Sciences
• Gaussian, VASP, AMBER, Schrödinger Jaguar, Schrödinger Glide, NAMD, DOCK, GAMESS, GOLD, mpiBLAST, GROMACS, MOLPRO, OpenEye FRED, OpenEye OMEGA, SCM ADF, HMMER
Energy
• Schlumberger ECLIPSE, Paradigm GeoDepth, 3DGEO 3DPSDM, Norsar 3D
EDA
• Mentor, Cadence, Synopsys
Finance
• Wombat, KX
Others
• The MathWorks MATLAB, R, Octave, Wolfram MATHEMATICA, ISC STAR-P
![Page 12: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/12.jpg)
vSMP Foundation 2.0
Support for Intel® Nehalem Processor Family
– First Nehalem solution with more than 2 processors
– Up to 3 times better performance compared to Harpertown systems
– Optimized performance with intra-board memory placement and QDR InfiniBand
High-availability with dual-rail InfiniBand
– 2 InfiniBand switches (dual-rail) in an active-active configuration
– Automatic failover on link errors (cable) or switch failure
– Improved performance with switch load-balancing (both switches used in parallel)
Partitioning
– Hardware-level isolated partitions, each can run a different OS
– Up to 8 partitions, minimum 2 servers per partition
– Requires add-on license
Emulex LightPulse® Fibre-Channel HBA Support
[Diagram: Server A, Server B and Server C connected to InfiniBand Switch 1 and InfiniBand Switch 2, with automatic failover and load-balancing]
[Diagram: Single Partition versus Multiple Partitions]
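The partitioning limits on this slide (up to 8 partitions, at least 2 servers per partition) can be written down as a simple layout check. The sketch below is illustrative only: `validate_partitions` is a hypothetical helper, not part of any ScaleMP tool, and the 16-node ceiling is an assumption carried over from the "Up to 16 nodes" capability stated earlier in the deck.

```python
MAX_PARTITIONS = 8             # slide: "Up to 8 partitions"
MIN_SERVERS_PER_PARTITION = 2  # slide: "minimum 2 servers per partition"
MAX_NODES = 16                 # assumption: "Up to 16 nodes" also applies here


def validate_partitions(servers_per_partition):
    """Return a list of problems with a proposed layout; empty means it fits."""
    problems = []
    count = len(servers_per_partition)
    if count > MAX_PARTITIONS:
        problems.append(f"{count} partitions exceeds the limit of {MAX_PARTITIONS}")
    for index, servers in enumerate(servers_per_partition):
        if servers < MIN_SERVERS_PER_PARTITION:
            problems.append(
                f"partition {index} has {servers} server(s); "
                f"minimum is {MIN_SERVERS_PER_PARTITION}"
            )
    total = sum(servers_per_partition)
    if total > MAX_NODES:
        problems.append(f"{total} servers exceeds the assumed {MAX_NODES}-node ceiling")
    return problems


print(validate_partitions([2] * 8))  # eight 2-server partitions
print(validate_partitions([1, 4]))   # first partition is too small
```

Under these assumptions, eight 2-server partitions is the largest number of partitions that still fits within 16 nodes.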
![Page 13: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/13.jpg)
vSMP Foundation 2.0
COMPLETE SYSTEM VIEW - NOW AVAILABLE FOR ACADEMIC INSTITUTES!
[Figure: system view, Before and After]
![Page 14: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/14.jpg)
Some Performance Data
GAUSSIAN
![Page 15: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/15.jpg)
Some Performance Data (continued)
GAUSSIAN
![Page 16: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/16.jpg)
vSMP Foundation Performance
STREAM (OMP) - MB/SEC. (HIGHER IS BETTER)

Series: 1333MHz FSB (128 cores / 16 boards), 1600MHz FSB (128 cores / 16 boards), QPI 6.4GT/s (32 cores / 4 boards). Efficiency is compared to 8 cores / 1 board.

| Cores | 1333MHz FSB bandwidth (MB/sec.) | 1333MHz efficiency | 1600MHz FSB bandwidth (MB/sec.) | 1600MHz efficiency | QPI 6.4GT/s bandwidth (MB/sec.) | QPI efficiency |
| --- | --- | --- | --- | --- | --- | --- |
| 8 | 6,157 | 100% | 9,714 | 100% | 33,279 | 100% |
| 16 | 12,018 | 98% | 19,145 | 99% | 66,536 | 100% |
| 32 | 23,981 | 97% | 38,197 | 98% | 131,462 | 99% |
| 64 | 44,111 | 90% | 75,634 | 97% | | |
| 128 | 73,218 | 74% | 142,280 | 92% | | |

HW Characteristics:
• 1333MHz - 32 x Intel XEON E5345 QC (Clovertown), 2.33GHz, 2x4MB L2, 1333MHz; 900/960GB (vSMP Foundation 1.7) (Source: ScaleMP)
• 1600MHz - 32 x Intel XEON E5472 QC (Harpertown), 3.00GHz, 2x6MB L2, 1600MHz; 249/288GB (vSMP Foundation 1.7) (Source: ScaleMP)
• QPI 6.4GT/s - 4 x Intel XEON X5570 QC (Nehalem), 2.93GHz, 8MB L3, QPI 6.4; 9/16GB (vSMP Foundation 1.7) (Source: ScaleMP)
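The efficiency percentages on this slide appear to be the measured bandwidth divided by ideal linear scaling from the 8-core, 1-board result. The following is a minimal Python sketch under that assumption; the function name `scaling_efficiency` is illustrative, and the bandwidth values are the 1333MHz FSB figures from the slide.

```python
def scaling_efficiency(base_cores, base_rate, cores, rate):
    """Measured rate as a fraction of ideal linear scaling from the baseline."""
    ideal = base_rate * (cores / base_cores)
    return rate / ideal


# STREAM (OMP) bandwidth in MB/sec. for the 1333MHz FSB system, from the slide.
stream_1333 = {8: 6157, 16: 12018, 32: 23981, 64: 44111, 128: 73218}

# Efficiency in whole percent, relative to the 8-core / 1-board baseline.
efficiency_1333 = {
    cores: round(100 * scaling_efficiency(8, stream_1333[8], cores, rate))
    for cores, rate in stream_1333.items()
}
print(efficiency_1333)
```

With this definition the computed values match the slide's 1333MHz series (100%, 98%, 97%, 90%, 74%), which supports the assumed formula without confirming it is the one the authors used.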
![Page 17: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/17.jpg)
vSMP Foundation Performance
SPECint_rate_base2000 - RATE (HIGHER IS BETTER)

| Benchmark | Rate, vSMP Foundation (8 cores / 1 board) | Rate, vSMP Foundation (128 cores / 16 boards) | Efficiency (128 cores to 8 cores) |
| --- | --- | --- | --- |
| 164.gzip | 110 | 1,560 | 89% |
| 175.vpr | 78 | 922 | 74% |
| 176.gcc | 147 | 1,803 | 77% |
| 181.mcf | 32 | 334 | 65% |
| 186.crafty | 242 | 3,733 | 96% |
| 197.parser | 84 | 1,008 | 75% |
| 252.eon | 291 | 4,328 | 93% |
| 253.perlbmk | 197 | 2,332 | 74% |
| 254.gap | 94 | 1,007 | 67% |
| 255.vortex | 213 | 2,944 | 86% |
| 256.bzip2 | 113 | 1,498 | 83% |
| 300.twolf | 176 | 2,282 | 81% |
| SPECint_rate2000 | 128 | 1,623 | 79% |

HW Characteristics:
• vSMP Foundation™ (QC-8 core): 2 x Intel XEON 5345 QC (Clovertown), 2.33GHz, 2x4MB L2; 908/960GB (vSMP Foundation 1.7) (Source: ScaleMP)
• vSMP Foundation™ (QC-128 core): 32 x Intel XEON 5345 QC (Clovertown), 2.33GHz, 2x4MB L2; 908/960GB (vSMP Foundation 1.7) (Source: ScaleMP)
![Page 18: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/18.jpg)
vSMP Foundation Performance
SPECint_rate_base2006 - RATE (HIGHER IS BETTER)

| Benchmark | Rate, 16 Threads (1 board) | Rate, 64 Threads (4 boards) | Efficiency |
| --- | --- | --- | --- |
| 400.perlbench | 198 | 796 | 101% |
| 401.bzip2 | 145 | 560 | 97% |
| 403.gcc | 189 | 736 | 97% |
| 429.mcf | 253 | 997 | 99% |
| 445.gobmk | 223 | 889 | 100% |
| 456.hmmer | 168 | 670 | 100% |
| 458.sjeng | 209 | 831 | 99% |
| 462.libquantum | 692 | 2,380 | 86% |
| 464.h264ref | 293 | 1,160 | 99% |
| 471.omnetpp | 167 | 651 | 97% |
| 473.astar | 144 | 561 | 97% |
| 483.xalancbmk | 250 | 969 | 97% |
| SPECint_rate_base2006 | 220 | 857 | 97% |

HW Characteristics:
• QPI 6.4GT/s - 4 x Intel XEON X5570 QC (Nehalem), 2.93GHz, 8MB L3, QPI 6.4; 9/16GB (vSMP Foundation 1.7) (Source: ScaleMP)
![Page 19: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/19.jpg)
vSMP Foundation Performance
SPECfp_rate_base2000 - RATE (HIGHER IS BETTER)

| Benchmark | Rate, vSMP Foundation (8 cores / 1 board) | Rate, vSMP Foundation (128 cores / 16 boards) | Efficiency (128 cores to 8 cores) |
| --- | --- | --- | --- |
| 168.wupwise | 100 | 1,282 | 80% |
| 171.swim | 50 | 600 | 75% |
| 172.mgrid | 44 | 469 | 67% |
| 173.applu | 138 | 1,420 | 64% |
| 177.mesa | 210 | 3,132 | 93% |
| 178.galgel | 265 | 2,864 | 68% |
| 179.art | 164 | 1,494 | 57% |
| 183.equake | 46 | 561 | 76% |
| 187.facerec | 105 | 1,061 | 63% |
| 188.ammp | 96 | 1,267 | 83% |
| 189.lucas | 56 | 594 | 66% |
| 191.fma3d | 63 | 679 | 67% |
| 200.sixtrack | 82 | 1,264 | 97% |
| 301.apsi | 103 | 1,374 | 83% |
| SPECFP_rate2000 | 93 | 1,097 | 73% |

HW Characteristics:
• vSMP Foundation™ (QC-8 core): 2 x Intel XEON 5345 QC (Clovertown), 2.33GHz, 2x4MB L2; 908/960GB (vSMP Foundation 1.7) (Source: ScaleMP)
• vSMP Foundation™ (QC-128 core): 32 x Intel XEON 5345 QC (Clovertown), 2.33GHz, 2x4MB L2; 908/960GB (vSMP Foundation 1.7) (Source: ScaleMP)
![Page 20: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/20.jpg)
vSMP Foundation Performance
SPECfp_rate_base2006 - RATE (HIGHER IS BETTER)

| Benchmark | Rate, 16 Threads (1 board) | Rate, 64 Threads (4 boards) | Efficiency |
| --- | --- | --- | --- |
| 410.bwaves | 166 | 621 | 93% |
| 416.gamess | 197 | 784 | 100% |
| 433.milc | 131 | 499 | 95% |
| 434.zeusmp | 193 | 759 | 98% |
| 435.gromacs | 187 | 744 | 99% |
| 436.cactusADM | 217 | 848 | 98% |
| 437.leslie3d | 119 | 449 | 95% |
| 444.namd | 174 | 695 | 100% |
| 447.dealII | 237 | 932 | 98% |
| 450.soplex | 131 | 473 | 91% |
| 453.povray | 259 | 1,030 | 99% |
| 454.calculix | 220 | 870 | 99% |
| 459.GemsFDTD | 107 | 402 | 94% |
| 465.tonto | 196 | 794 | 101% |
| 470.lbm | 108 | 396 | 92% |
| 481.wrf | 206 | 780 | 95% |
| 482.sphinx3 | 188 | 702 | 93% |
| SPECfp_rate_base2006 | 173 | 666 | 96% |

HW Characteristics:
• QPI 6.4GT/s - 4 x Intel XEON X5570 QC (Nehalem), 2.93GHz, 8MB L3, QPI 6.4; 9/16GB (vSMP Foundation 1.7) (Source: ScaleMP)
![Page 21: …updates…](https://reader035.fdocuments.in/reader035/viewer/2022062814/5681682a550346895dddbe0e/html5/thumbnails/21.jpg)
Powering high-end x86 systems
Aggregate. Scale. Simplify. Save.
Shai Fultheim, Founder and President
[email protected], +1 (408) 480 1612