NATIONAL PARTNERSHIP FOR ADVANCED COMPUTATIONAL INFRASTRUCTURE
SAN DIEGO SUPERCOMPUTER CENTER
Early Experiences with DataStar: A 10 TF Power4 + Federation System
Giri Chukkapalli, SDSC
Outline
• DataStar specs and setup
• Porting issues from BH to DS
• Initial setup difficulties and current DS status
• Initial science highlights
• Benchmark codes
• Results
• Conclusions
SDSC
• National NSF center with compute and data resources allocated freely through a peer-review process
• Transitioning from NPACI to Cyberinfrastructure through the TeraGrid
• SDSC's emphasis and contribution to the national CI initiative is data-intensive computing
• Acquired DataStar at the beginning of the year to replace Blue Horizon (BH) as the main compute engine
DataStar
• 10.1 TF, 1,760 processors total
• 11 32-way 1.7 GHz IBM p690s
  • 2 nodes with 64 GB memory for login and interactive use
  • 6 nodes with 128 GB memory for scientific computation
  • 2 nodes with 128 GB memory for database, DiscoveryLink
  • 1 node with 256 GB memory for batch scientific computation
  • All p690s connected to Gigabit Ethernet, with 10 GE coming soon
• 176 8-way 1.5 GHz IBM p655s
  • 16 GB memory each
  • Batch scientific computation
• All nodes Federation switch attached
• All nodes SAN attached
SDSC DataStar
[System diagram: 187 total nodes (11 p690, 176 p655). The p690 nodes (1.7 GHz, 128 GB+, one batch node with 256 GB) serve login, interactive, database, and batch roles; the p655 nodes (1.5 GHz, 16 GB) serve interactive and batch scientific computation. All nodes attach to the Federation switch and to the Storage Area Network (SAN); Gigabit Ethernet now, 10 GE planned; 30 Gb/s TeraGrid network link to L.A.; storage services include HPSS, SAM-QFS, NFS, and GPFS, backed by a tape drive/silo.]
SANergy Data Movement
[Diagram: SANergy data movement between Orion and DataStar over the TeraGrid network. A p690 acts as the SANergy client: metadata operations go over NFS to the SANergy MDC, while data operations go directly through the SAN switch infrastructure (2 Gb and 4 x 1 Gb links, 4 x 2 Gb to disk) to the SAM-QFS disk; the p690 also connects to the Federation switch.]
[Diagram: SAN storage infrastructure. ~400 Sun FC disk arrays (~4,100 disks, 540 TB total) and 32 FC tape drives behind 5 x Brocade 12000 switches (1,408 2 Gb ports); a Sun Fire 15K serves as the SANergy server, with the DataStar 176 p655s and 11 p690s as SANergy clients; file systems and services include SAM-QFS, ETF DB, SAN-GPFS, and HPSS; a Force 10 12000 provides the Ethernet fabric.]
Porting issues
• We moved from default 32-bit to default 64-bit compilation
• Fairly easy to port from BH to DS
• Mixed Fortran + C/C++ codes give some trouble (see the sketch below)
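A typical 64-bit porting pitfall at the Fortran/C boundary: under 64-bit compilation, C pointers and long grow to 8 bytes while the default Fortran INTEGER stays 4 bytes. The fragment below is a minimal, hypothetical illustration of this kind of mismatch (it is not taken from any DS code, and the symbol names are invented).

/* Hypothetical mixed-language interface, 64-bit build.
 * Fortran caller (default INTEGER remains 32-bit under -q64):
 *     INTEGER :: handle
 *     CALL store_handle(handle)
 */
#include <stdint.h>

/* Broken in 64-bit mode: C long is now 8 bytes but the Fortran
 * INTEGER argument is only 4, so the store overruns the argument. */
void store_handle_(long *handle)
{
    *handle = 42L;
}

/* Safe: use an explicitly 32-bit type to match the Fortran default
 * INTEGER (or declare INTEGER(KIND=8) on the Fortran side). */
void store_handle_safe_(int32_t *handle)
{
    *handle = 42;
}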
Initial Setup and difficulties
• Substantial jump in weight, power, and cooling load from BH to DS
• Memory and performance leaks
  • Fixed through NFS automounts
  • Removed unnecessary daemons
• Problems related to GPFS over SAN
  • IBM FC adapters, Brocade switches, and Sun disks
• Loss of processors, memory, cache
• HMC issues
• Federation issues
Large scale preproduction computing
• 100k to 200k hours per project
• Onuchic, UCSD: study of the folding kinetics of a beta hairpin at room temperature in explicit water
• Yeung: 2048**3 turbulence run
• Goodrich, BU: 3D calculation of the shearing and eruption of solar active regions on a 201 x 251 x 251 mesh
• Richard Klein, UCB: dynamical evolution, gravitational collapse, and fragmentation of a large turbulent molecular cloud in the galaxy
• NREL, Cornell: cellulose project, a 1-million-atom CHARMM simulation of protein interactions
Performance analysis
• All codes are 64-bit compiled
• Still some problems, so the results may not be the best
• All runs are done on 1.5 GHz p655s
• Communication performance
• NAS benchmarks: CG, FT, LU, and MG
• Applications: ENZO and CHARMM
Latency and Bandwidth comparison
             MPI latency (usec)     Bandwidth (MB/s)
             BH       DS            BH       DS
Intra-node   12.68    3.9           512.2    3120.4
Inter-node   18.6     7.56          353.6    1379.1
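Figures like these are typically obtained with a simple two-rank ping-pong test. The sketch below is an assumed measurement method, not the actual benchmark harness used here: latency is read off small messages and bandwidth off large ones.

/* Minimal MPI ping-pong: ranks 0 and 1 exchange a message of 'size'
 * bytes 'reps' times; half the average round-trip time is the one-way
 * latency, and size / latency gives the bandwidth. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, i, reps = 1000;
    int size = (argc > 1) ? atoi(argv[1]) : 1048576;   /* default 1 MB */
    char *buf;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    buf = malloc(size > 0 ? size : 1);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(buf, size, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, size, MPI_BYTE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, size, MPI_BYTE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, size, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0) {
        double one_way = (t1 - t0) / (2.0 * reps);      /* seconds */
        printf("latency %.2f usec  bandwidth %.1f MB/s\n",
               one_way * 1e6, size / one_way / 1e6);
    }
    free(buf);
    MPI_Finalize();
    return 0;
}

Running the two ranks on the same node measures the intra-node path; placing them on different nodes measures the inter-node (switch) path.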
Bisection bandwidth
[Chart: bisection bandwidth (GB/s, 0-100) vs. number of PEs (0-1200), comparing the Colony and Federation switches.]
Barrier performance (Federation)
[Chart: MPI_Barrier time (usec, 0-140) vs. number of PEs (0-600).]
MPI broadcast performance
[Chart: MPI_Bcast time (usec, log scale) vs. number of PEs, for message sizes of 0 bytes, 16 kB, and 1 MB.]
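Curves like these are commonly produced by synchronizing, repeating the collective, and averaging. The sketch below shows one assumed way to time MPI_Bcast at the three plotted message sizes; it is not the actual harness used for these measurements.

/* Times MPI_Bcast at 0 B, 16 kB, and 1 MB; the slowest rank's average
 * is reported, since it determines the effective broadcast time. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int sizes[3] = { 0, 16 * 1024, 1024 * 1024 };
    const int reps = 100;
    int rank, nprocs, s, i;
    char *buf;
    double t0, local, worst;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    buf = malloc(1024 * 1024);

    for (s = 0; s < 3; s++) {
        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < reps; i++)
            MPI_Bcast(buf, sizes[s], MPI_BYTE, 0, MPI_COMM_WORLD);
        local = (MPI_Wtime() - t0) / reps;
        MPI_Reduce(&local, &worst, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("%d PEs  %7d bytes  %.1f usec\n", nprocs, sizes[s], worst * 1e6);
    }
    free(buf);
    MPI_Finalize();
    return 0;
}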
Unusual Bcast behavior
[Chart: MPI_Bcast time (usec, 0-3000) vs. message size (0-140,000 bytes) on 1024 PEs.]
MPI_Alltoall
[Chart: MPI_Alltoall time (usec, log scale) vs. number of PEs (1-512), for message sizes of 0 bytes, 16 kB, and 1 MB.]
• Understanding raw MPI call performance as a function of message size and number of PEs is useful for interpreting MPI traces from real applications
GPFS performance
• Metadata performance
• File read and write performance
• MPI_tile_io, IOR, Pallas MPI I/O, MDTEST
• Each processor writes one rectangular tile (sketched below)
  • 16 bytes per element
  • Each tile is 1000 x 1000 elements
• A 1024-PE run writes 32 x 32 tiles: a 16 GB file
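This access pattern, each rank owning one 1000 x 1000 tile of a global 2-D array, maps naturally onto an MPI-IO subarray file view with a collective write. The sketch below shows the general technique under assumed details (a 16-byte element built from two doubles, a square process grid, an invented output file name); it is not the MPI_tile_io source.

/* Each of px*py ranks writes its 1000x1000 tile (16-byte elements) of a
 * (px*1000) x (py*1000) global array with one collective MPI-IO call. */
#include <mpi.h>
#include <math.h>
#include <stdlib.h>

#define TILE 1000

int main(int argc, char **argv)
{
    int rank, nprocs, px, py;
    MPI_Datatype elem, tile;
    MPI_File fh;
    void *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    px = py = (int)(sqrt((double)nprocs) + 0.5);   /* e.g., 32 x 32 at 1024 PEs */

    /* 16-byte element, assumed here to be two doubles. */
    MPI_Type_contiguous(2, MPI_DOUBLE, &elem);
    MPI_Type_commit(&elem);

    /* This rank's tile described as a subarray of the global array. */
    {
        int gsizes[2] = { px * TILE, py * TILE };
        int lsizes[2] = { TILE, TILE };
        int starts[2] = { (rank / py) * TILE, (rank % py) * TILE };
        MPI_Type_create_subarray(2, gsizes, lsizes, starts, MPI_ORDER_C, elem, &tile);
        MPI_Type_commit(&tile);
    }

    buf = calloc((size_t)TILE * TILE, 16);         /* 16 MB per rank */

    MPI_File_open(MPI_COMM_WORLD, "tile_test.out",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, 0, elem, tile, "native", MPI_INFO_NULL);
    MPI_File_write_all(fh, buf, TILE * TILE, elem, MPI_STATUS_IGNORE);  /* collective */
    MPI_File_close(&fh);

    free(buf);
    MPI_Finalize();
    return 0;
}

At 1024 PEs each rank writes a 16 MB tile, giving the 16 GB file described above.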
GPFS Performance (MPI I/O)
[Chart: MPI_tile_io read and write bandwidth (MB/s, 0-4000) vs. number of PEs (0-1200).]
NAS Benchmarks: Strong scaling
[Chart: NAS Class C aggregate performance (MFLOPS, 0-300,000) vs. number of PEs (0-600) for CG, FT, LU, and MG on BH and DS.]
NAS benchmarks: strong scaling
[Chart: percentage of peak FLOPS (0-20%) vs. number of PEs (0-600) for CG, FT, LU, and MG on BH and DS.]
NAS benchmarks: strong scaling
• DS %peak is flatter than BH owing to the better bandwidth
• Initially %peak increases due to cache effects and then drops due to communication overheads
• Sparse-matrix codes give the worst %peak and dense-matrix codes give the best %peak (a worked example of the %peak calculation follows)
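For reference on how %peak is presumably computed here: assuming the usual Power4 figure of 4 floating-point operations per cycle (two fused multiply-add units), a 1.5 GHz p655 processor peaks at 6 GFLOPS, so a hypothetical 512-PE run sustaining 150,000 MFLOPS would be at 150,000 / (512 x 6,000) ≈ 4.9% of peak.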
Application benchmarks
• ENZO: astrophysics code
  • Consumes 1 million hours on DataStar
• CHARMM: MD chemistry code
• Both are community codes
Original Enzo performance on DS and BG/L
[Chart: Enzo speed per processor (cell-steps/s/proc, 1,000-100,000) vs. number of processors (10-1,000) for DS 256^3, BG/L 1 proc/node 256^3, BG/L 2 proc/node 256^3, and DS 512^3.]
Improved Enzo performance on BH and DS
[Chart: Enzo speed per processor (cell-steps/s/proc, 1,000-100,000) vs. number of processors (10-1,000) for DS 512^3, DS 256^3, and BH 256^3.]
CHARMM performance
• 64-bit compilation
• Rarely been run beyond a single node
• Previously it didn't scale beyond 32 PEs
• Short hand-written FFTs need to be optimized further
• Initiated a tuning project with the developers targeting large-scale simulations
Summary
• DataStar is rapidly becoming stable, with good performance
• Switch routing issues remain
1-million-atom CHARMM run
[Chart: time (sec, 0-500) vs. number of PEs (0-150) for dynamics and total.]