SAN DIEGO SUPERCOMPUTER CENTER
at the UNIVERSITY OF CALIFORNIA, SAN DIEGO
SDSC's HEC + Data Environment
Giri Chukkapalli, San Diego Supercomputer Center
July 25, 2005
Outline
• SDSC's mission
• HEC + Data target application space
• HEC + Data infrastructure supporting the target application space
• Future improvements
SDSC background
• National NSF center with compute and data resources allocated freely through a peer-review process
• Transitioning from NPACI to cyberinfrastructure through the TeraGrid
• SDSC's piece of the national cyberinfrastructure puzzle is "data-intensive computing"
• A 10 TFLOPS IBM SP4 system, "DataStar," is the main compute engine; its capacity will soon be doubled
[Diagram: the cyberinfrastructure stack. At the bottom, hardware resources (vector/SMP machines, MPPs, loosely coupled clusters, workstations, data engines, servers, web servers, sensors and instruments) are joined by a network/data transport layer and the Globus layer. Above these sit resource-specific software (operating systems, compilers, Oracle, Tomcat, A/D), cyberinfrastructure (grid middleware, bridge software, schedulers, tools and libraries), and complex systems (problem-solving environments, portals, UIs, web services), supporting domain-specific applications in the life sciences (bioinformatics), engineering (automotive/aircraft), environmental science (climate/weather), astrophysics, and other fields.]
[Chart: application space plotted as data (increasing I/O and storage) vs. compute (increasing FLOPS). Lower left: campus, departmental, and desktop computing. Lower right (traditional HEC environment): QCD, protein folding, CHARMM, Gaussian, CPMD, turbulence reattachment length. Upper left (data storage/preservation environment): NVO, EOL, Cypres, SCEC post-processing, ENZO post-processing. Upper right (extreme I/O environment): SCEC simulation, ENZO simulation, CFD, turbulence fields, climate; these include (1) time variation of field-variable simulations and (2) out-of-core computations. The top two quadrants form the SDSC Data Science Environment and are SDSC's focus; the most data-intensive of these applications cannot be done on the Grid (I/O exceeds WAN capacity), while others may become distributed-I/O capable in the future.]
Constant $ Buys Different Solutions
[Chart: the same data (increasing I/O and storage) vs. compute (increasing FLOPS) space, showing that constant dollars buy different solutions. Increasing cost ($M, $$M, $$$M) moves from departmental and desktop computing to the traditional HEC environment and the SDSC Data Science Environment. Example design points include Blue Gene with a 1:64 I/O-to-compute ratio, Blue Gene with a 1:8 I/O-to-compute ratio, and extreme data infrastructure.]
Data intensive applications are not monolithic
• They span the compute vs. interconnect space
• They span the compute vs. memory space
• Memory bandwidth is essential for almost all data-intensive applications
• SDSC's hardware infrastructure strategy is intended to cover this broad data-intensive application space
Application Space: Interconnect vs. Compute
[Chart: interconnect bandwidth (increasing bytes/sec) vs. compute (increasing FLOPS). Machine classes range from campus, departmental, and desktop computing through cluster farms and MIMD machines to global shared-memory machines. Applications placed in this space include pseudo-spectral and multipole methods, AMR, electrostatics, gravity, spectral-element methods, replica exchange, Monte Carlo, optimization, and finite-element and finite-difference methods.]
Application Space: Memory Bandwidth vs. Compute
[Chart: memory bandwidth (increasing bytes/sec) vs. compute (increasing FLOPS). Machine classes range from campus, departmental, and desktop computing through MIMD machines to global shared-memory machines. Applications placed in this space include radiation hydrodynamics, climate, dense-matrix and exponential kernels, stencil-based methods, sparse-matrix kernels, clustering/sorting, contact dynamics, and AMR.]
Target application characteristics
• Consumes, processes, and generates large amounts of data
• High bytes/FLOPS ratio
• I/O-intensive applications
  • those that can tolerate latency
  • those that cannot tolerate latency
Data intensive application space
• Out-of-core simulations, where the working data set doesn't fit in memory
• Adjoint methods, where governing-equation-based simulations are driven by continuous absorption of empirical data
• A broad class of data-mining applications
• Standard time-evolution simulations of field variables, which write out time histories
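As an illustration of the out-of-core case above, the following is a minimal sketch (not from the original slides) of processing an array larger than memory by memory-mapping it from disk and walking through it in chunks; the file name, array shape, and chunk size are arbitrary choices for the example.

```python
import numpy as np

# Hypothetical field file: a large 3D array of doubles written earlier by a solver.
FIELD_FILE = "field_t0042.dat"     # assumed name, for illustration only
SHAPE = (4096, 2048, 2048)         # ~137 GB of float64, far larger than a node's memory
CHUNK = 16                         # number of x-planes brought into RAM per pass

# Memory-map the file so only the slices we touch are paged in from disk.
field = np.memmap(FIELD_FILE, dtype=np.float64, mode="r", shape=SHAPE)

total = 0.0
for i in range(0, SHAPE[0], CHUNK):
    block = np.asarray(field[i:i + CHUNK])   # read one chunk into memory
    total += float(block.sum())              # any per-chunk reduction or analysis goes here

print("global sum of field:", total)
```

The same chunked-access pattern applies whether the reduction is a simple sum, a statistical analysis, or a visualization pass over the stored time histories.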
Applications consuming external data
• Gathering and processing of data from sensor networks
• Climate/weather data
• Medical imaging data
• Particle physics data
• Astronomy data
• Data from earthquake sensor networks
• Providing a data environment for processing at other centers
Internal data applications
• Preprocessing, simulation, and post-processing
• Visualizing large-scale 3D and 4D data
• Workflow-based simulations where different parts of the workflow are mapped to different compute resources
• Coupled multi-physics applications
External consumers of internal data
• Serving collections
• Serving simulation data
• Processing of data stored at SDSC by other centers
Data Science Applications Matrix
[Table: the original slide lists each application with a Name, a Metric, a Data size (TB), and its data-handling categories (Parallel I/O, Data Parking, Extreme I/O, Post-processing); the per-application data sizes listed range from 1 TB to 1,000 TB. Applications and their metrics:
• Linguistics: 25k files, 512 CPUs, 1 hour
• NVO: hyper-atlases, 5 million files
• SCEC: 240 CPUs, 1.8-billion-point cube, 4-minute quake with 27k timesteps
• ENZO: 1024^3 mesh, 1000s of processors
• Climate (multiple): 1000s of processors
• Medical MRI: MRI exam reduced from 20 hours to 10 minutes
• HEP: CPU count depends on WAN bandwidth
• DNS turbulence: 2048^3 mesh, 1,024 CPUs, 62k steps
• LES turbulence: 512 processors on BG/L, 10k steps
• SpecFEM3D: 1014 processors/3D on BG/L
• GEON: multiple 800 x 800 x 200 km runs, 2,000 s simulated with a 0.01 s timestep]
HEC + Data environment
• A hardware and software environment facilitating smooth flow of data from its origin to its destination without bottlenecks
• Much more than a single supercomputer with lots of FLOPS
Infrastructure to move data in and out of SDSC
• TeraGrid hardware infrastructure
  • 30 Gb/s TeraGrid backbone and expanding networks
  • GFS global file system
• TeraGrid joining other grid initiatives
  • EU grid
  • Open Science Grid
  • Asian grids, etc.
• Software infrastructure
  • SRB
  • Grid tools such as globus-url-copy (see the sketch below)
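As a rough illustration of the grid-tool path above, the following is a minimal sketch (not part of the original slides) that stages a remote file into a local parking file system by invoking the globus-url-copy command; the host names, paths, and parallel-stream count are hypothetical choices.

```python
import subprocess

# Hypothetical endpoints: a GridFTP server at a remote site and a local parking area.
SRC = "gsiftp://remote.site.example.org/data/run042/fields.h5"
DST = "file:///gpfs/parking/myproject/run042/fields.h5"

# -p 4 requests four parallel TCP streams; a valid grid proxy is assumed to exist.
cmd = ["globus-url-copy", "-p", "4", SRC, DST]

result = subprocess.run(cmd, capture_output=True, text=True)
if result.returncode != 0:
    # Surface the transfer error rather than silently continuing the workflow.
    raise RuntimeError(f"transfer failed: {result.stderr.strip()}")
print("staged", SRC, "to", DST)
```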
Infrastructure to move data within the center
• 500 TB of SAN storage mounted across major compute resources
• 500 TB of GFS mounted across SDSC as well as at other centers
• 1 petabyte each of HPSS and SAM-QFS tape archive space accessible to major compute resources
HEC platforms supporting Data intensive computing
• DataStar
  • Large shared-memory and MIMD nodes are on the same interconnect and parking file systems
  • Tightly coupled: low-latency, high-bandwidth interconnect
  • Supports a large SAN-based parallel file system
  • Reasonably fast access to the HPSS archive from the large shared-memory nodes (p690s)
SDSC DataStar
[Diagram: DataStar configuration. 187 total nodes: 11 p690 nodes (128 GB+ of memory each, one batch node with 256 GB) serving login, interactive, database, and batch roles, and 176 p655 nodes (171 interactive, 5 batch; 16 GB each), with clock speeds of 1.5 and 1.7 GHz across the node pool, all joined by a Federation switch. Storage is provided over a Storage Area Network (SAN) with GPFS, NFS, SAM-QFS, and an HPSS tape drive/silo. Nodes also connect via Gigabit Ethernet (10 GE planned) and a 30 Gb/s TeraGrid network link to L.A.]
SANergy Data Movement
[Diagram: SANergy data movement. A p690 node (Orion) on the TeraGrid network and Federation switch acts as a SANergy client; metadata operations flow over NFS to the SANergy metadata controller (MDC), while data operations flow through the SAN switch infrastructure (1 Gb x 4, 2 Gb, and 2 Gb x 4 links) directly to SAM-QFS disk.]
[Diagram: SDSC SAN infrastructure. ~400 Sun FC disk arrays (~4,100 disks, 540 TB total) and 32 FC tape drives connect through 5 Brocade 12000 switches (1,408 2 Gb ports) to DataStar's 176 p655s (SAN-GPFS), DataStar's 11 p690s (SANergy clients), a Sun Fire 15K running SAM-QFS and the ETF database (SANergy server), a Force 10 12000 Ethernet switch, and HPSS.]
HEC platforms supporting Data intensive computing
• TeraGrid
  • Faster processors and memory
  • Better global connectivity
  • Not as good an interconnect
  • SAN-based parallel file system
  • Connectivity to parking space and global GFS space
HEC platforms supporting Data intensive computing: I/O rich BG/L frame
• BlueGene/L (optimized for data-intensive computing)
  • 5.7 TFLOPS peak
  • 512 GB memory
  • 16 GB/s aggregate I/O rate
  • 1,024 compute nodes
  • 128 I/O nodes
• Node
  • 2 x 700 MHz PowerPC processors
  • 512 MB memory
  • 3D torus and global tree connections
• GPFS parallel file system over half a petabyte of SATA disk
Installed at SDSC in December, the first such system outside of IBM or LLNL
SDSC's single-rack system has 128 I/O nodes
MPI Effective Bandwidth
[Plot: MPI effective bandwidth (GB/s, 0 to ~100) vs. number of PEs (0 to ~2,500), with curves for BH, DS, BG-VN, and BG-CO.]
Network performance of BG compares favorably to the p655s, especially when bandwidth is normalized by processors per node and clock speed; absolute node-to-node latency is better for BG.

Benchmark (units) | DS p655s (8p/node) | BG VN (2p/node) | DS p655s / BG
Intranode PingPong bandwidth (MB/s) | 2,404 | 344 | 6.99
Node-to-node PingPong bandwidth (MB/s) | 1,428 | 159 | 8.99
Intranode PingPong latency (µs) | 2.2 | 3.1 | 0.69
Node-to-node PingPong latency (µs) | 6.2 | 4.3 | 1.46

Measured with the HPC Challenge benchmark, using 1,024p for the node-to-node results.
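To make the ping-pong numbers above concrete, here is a minimal sketch (not from the original slides) of the kind of two-rank bandwidth measurement that benchmarks like HPC Challenge perform, written with mpi4py; the message size and repetition count are arbitrary choices, and a real latency test would use much smaller messages.

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

NBYTES = 1 << 20          # 1 MiB message for the bandwidth estimate
REPS = 100                # average over many round trips
buf = np.zeros(NBYTES, dtype=np.uint8)

comm.Barrier()
t0 = MPI.Wtime()
for _ in range(REPS):
    if rank == 0:
        comm.Send(buf, dest=1, tag=0)      # rank 0 -> rank 1
        comm.Recv(buf, source=1, tag=1)    # rank 1 -> rank 0
    elif rank == 1:
        comm.Recv(buf, source=0, tag=0)
        comm.Send(buf, dest=0, tag=1)
elapsed = MPI.Wtime() - t0

if rank == 0:
    # Each repetition moves the message twice (there and back).
    bw = 2 * REPS * NBYTES / elapsed / 1e6
    one_way = elapsed / (2 * REPS) * 1e6
    print(f"PingPong bandwidth ~ {bw:.0f} MB/s, one-way time ~ {one_way:.1f} us")
```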
I/O performance with GPFS has been measured for two benchmarks and one application; maximum rates on Blue Gene are comparable to DataStar for the benchmarks, but slower for the application in VN mode.

Code & quantity | DS p655s, 8p/node (MB/s) | BG CO, 8p/IO node (MB/s) | BG VN, 16p/IO node (MB/s) | BG VN at 2048p, 16p/IO node (MB/s)
IOR write | 1,793 | 1,797 | 1,478 | 1,585
IOR read | 1,755 | 2,291 | 2,165 | 2,306
mpi-tile-io write | 2,175 | 2,040 | 1,720 | 1,904
mpi-tile-io read | 1,698 | 3,481 | 2,929 | 2,933
mpcugles write | 1,391 | 905 | 387 | —

IOR and mpi-tile-io results are on 1,024p, except for the last column; mpcugles results are on 512p.
IOR weak-scaling scans with GPFS show BG has a higher maximum for reads (2.3 vs. 1.9 GB/s), while DS has a higher maximum (than BG VN) for writes (1.8 vs. 1.6 GB/s).

[Plot: I/O rate (MB/s, log scale from 10 to 10,000) vs. processor count (1 to 2,048), with peak, write, and read curves for DataStar, BG CO, and BG VN. DataStar p655s (8p/node) and Blue Gene (CO: 8p or VN: 16p per I/O node); noncollective read/write via IOR (512 MB/p or 256 MB/p; -a POSIX -e -b 512m -t 1m or -b 256m).]
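For context on what these IOR runs exercise, the following is a minimal sketch (not from the original slides) of a noncollective, POSIX-style shared-file write in the same spirit as IOR's -a POSIX mode: every rank writes its own block of a shared file in fixed-size transfers. The file path, block size, and transfer size are stand-ins for IOR's -b and -t settings.

```python
from mpi4py import MPI
import os

comm = MPI.COMM_WORLD
rank, nprocs = comm.Get_rank(), comm.Get_size()

BLOCK = 256 * 1024 * 1024              # bytes written per process (like IOR -b 256m)
XFER = 1024 * 1024                     # bytes per write call (like IOR -t 1m)
PATH = "/gpfs/scratch/ior_like.dat"    # hypothetical parallel-file-system path
chunk = bytes([rank % 256]) * XFER     # 1 MiB payload, distinct per rank

comm.Barrier()
t0 = MPI.Wtime()
# Noncollective POSIX I/O: each rank seeks to its own region of the shared file.
fd = os.open(PATH, os.O_WRONLY | os.O_CREAT, 0o644)
os.lseek(fd, rank * BLOCK, os.SEEK_SET)
for _ in range(BLOCK // XFER):
    os.write(fd, chunk)
os.fsync(fd)                           # like IOR -e: include the fsync in the timing
os.close(fd)
comm.Barrier()
elapsed = MPI.Wtime() - t0

if rank == 0:
    rate = nprocs * BLOCK / elapsed / 1e6
    print(f"aggregate write rate ~ {rate:.0f} MB/s over {nprocs} processes")
```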
Example of data flow supporting data intensive app
• Using the Grid infrastructure, bring experimental data into the parking space or GFS
• Using the large-memory p690 nodes:
  • interpolate the fields onto the mesh (initial and boundary conditions)
  • partition the domain and generate input files on GPFS
• Run the simulation
• Move the files back to the parking space for post-processing, visualization, and analysis
• Archive and share the important results
(A sketch of this staging sequence follows below.)
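The following is a minimal sketch (not from the original slides) of how such a data flow might be scripted end to end; every command, path, and host name here is a hypothetical placeholder for site-specific staging, preprocessing, job-submission, and archiving tools.

```python
import subprocess

# Hypothetical end-to-end stages of the data flow described above.
STAGES = [
    ("stage-in",   ["globus-url-copy",
                    "gsiftp://instrument.example.org/obs/latest.dat",
                    "file:///gpfs/parking/run100/obs.dat"]),
    ("preprocess", ["./interpolate_and_partition",          # assumed tool run on a p690 node
                    "/gpfs/parking/run100/obs.dat",
                    "/gpfs/scratch/run100/"]),
    ("simulate",   ["llsubmit", "run100.job"]),             # batch submission only; a real
                                                            # workflow would wait for completion
    ("stage-out",  ["cp", "-r", "/gpfs/scratch/run100/output",
                    "/gpfs/parking/run100/"]),
    ("archive",    ["hsi", "put", "-R",                     # HPSS client, recursive put
                    "/gpfs/parking/run100/output"]),
]

for name, cmd in STAGES:
    print(f"== {name}: {' '.join(cmd)}")
    result = subprocess.run(cmd)
    if result.returncode != 0:
        raise SystemExit(f"stage '{name}' failed; stopping the workflow")
```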
Summary
• SDSC provides broad data-intensive computing and data-movement infrastructure
• We will be glad to port and help characterize your data-intensive HEC application
• Strategic Applications Collaboration
Future improvements
• Data allocations
• Availability of parking space
• Software tools to move data more efficiently
• Co-scheduling of compute, data, and network resources
• Infrastructure to support more complex workflows
• Web-services-based "science gateways" to bring supercomputing to wider communities
Pushing the Data-Intensive Envelope
[Table: storage hierarchy from the computer through memory, the parallel file system, and data parking to the archival tape system.

Tier | Today's leading edge | Tomorrow's demands
Computer | 15 TF | 100 TF
Memory | 4 TB at 2 TB/s | 10 TB at 10 TB/s
Parallel file system | 60 TB at ~1 GB/s | 3 PB at 100 GB/s
Data parking | 100 TB at ~1 GB/s | 10 PB at 100 GB/s
Archival tape system | 10 PB at 100 MB/s | 100 PB at 10 GB/s]