DDN Confidential.
Do NOT reproduce or distribute
Paving the way for Exascale:
Lessons learned from I/O accelerators
Extreme Scale Demonstrator, Prague, May 2016
Jean-Thomas Acquaviva, DDN
ddn.com © 2016 DataDirect Networks, Inc. * Other names and brands may be claimed as the property of others.
Any statements or representations around future events are subject to change.
Corporate Status: DDN Advanced Technical Center
R&D centered on emerging technology programs, Paris, France
25+ R&D engineers
I/O Bandwidth Requirements, as seen from checkpoint/restart needs
Bandwidth needs of next-generation pre-Exascale systems
Rules of thumb:
1/ Checkpointing should take less than 6 minutes per hour
2/ Checkpointing means draining half of system memory
Pre-Exascale system:
4 PB of system memory → bandwidth requirement of 5.6 TB/s
Oak Ridge lab.: Teng Wang, Weikuan Yu et al., “An Efficient Distributed Burst Buffer for Linux”, LUG 2014
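The 5.6 TB/s figure follows directly from the two rules of thumb. The sketch below reproduces the arithmetic, assuming the 4 PB refers to total system memory and that decimal units are used (1 PB = 10^15 bytes); the function name and parameters are illustrative, not from the slides.

```python
PB = 10**15  # decimal petabyte, an assumption about the units on the slide
TB = 10**12  # decimal terabyte


def checkpoint_bandwidth(memory_bytes, drained_fraction=0.5, window_seconds=6 * 60):
    """Bandwidth (bytes/s) needed to drain a fraction of memory within the window."""
    return memory_bytes * drained_fraction / window_seconds


# Rule 1: at most 6 minutes of checkpointing per hour.
# Rule 2: a checkpoint drains half of system memory.
required = checkpoint_bandwidth(4 * PB)
print(f"{required / TB:.1f} TB/s")  # 5.6 TB/s
```

Halving the checkpoint window or doubling the memory size doubles the required bandwidth, which is why the requirement grows so quickly toward Exascale.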
Irregular I/O Bandwidth Pressure
99% of the time the I/O sub-system is stressed below 30% of its peak bandwidth
70% of the time the system is stressed under 5% of its peak bandwidth
Argonne lab.: P. Carns, K. Harms et al., Understanding and Improving Computational Science Storage Access through Continuous Characterization, 2011
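Statistics of this kind can be derived from a sampled bandwidth trace. The following is a minimal sketch, assuming equally spaced samples; the trace values are synthetic, chosen only to mirror the percentages quoted above, and are not data from the cited study.

```python
def fraction_of_time_below(samples, peak, threshold):
    """Fraction of equally spaced samples strictly below threshold * peak."""
    limit = threshold * peak
    return sum(1 for s in samples if s < limit) / len(samples)


# Synthetic trace in GB/s (invented for illustration): mostly idle, rare bursts.
peak = 100.0
trace = [2.0] * 70 + [20.0] * 29 + [90.0] * 1

print(fraction_of_time_below(trace, peak, 0.05))  # 0.7
print(fraction_of_time_below(trace, peak, 0.30))  # 0.99
```

A storage system sized for the rare burst sits nearly idle most of the time, which is the motivation for absorbing bursts in a faster intermediate tier.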
What is IME? Distributed Virtually Shared Coherent Array of SSDs
SSD reshuffles the parameters
Latency / 40: 4 ms → 0.1 ms
Bandwidth × 3: 150 → 450 MB/s
Capacity / 8: 8 → 1 TB
Cost × 10: $0.05/Gbit → $0.04
What can we do with a costly, high-bandwidth, low-latency technology?
Dealing with System Complexity
Sporadic I/O traffic leads to difficult routing
Courtesy Philip Brighten, U. Illinois
State of the Art Storage Architecture
Courtesy Philip Brighten, U. Illinois
1 PF compute cluster
Burst buffer with DDN Infinite Memory Engine (IME) at 500 GB/s performance
DDN EXAScaler (Lustre) for scratch: 4 PB at 200 GB/s performance
EDR InfiniBand network
GS-WOS Bridge
PFS stats collection & monitoring
DDN GRIDScaler for home & nearline: 3.5 PB at 100 GB/s performance
WOS over 10GbE
NAS gateways & data transfer nodes
5 PB DDN WOS object storage archive
MetroX: remote login nodes at NUS
MetroX: remote login nodes at NTU
I/O proxies act as traffic aggregators: routing is easier
Components: I/O proxy, storage, multicore, manycore, GPU
Exascale as a System of Systems
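One way to picture a proxy acting as a traffic aggregator is to coalesce many small write requests into fewer, larger extents before forwarding them to storage. The sketch below is purely illustrative: the `coalesce` function and the (offset, length) request format are hypothetical and do not describe how IME or any DDN product is implemented.

```python
def coalesce(requests):
    """Merge (offset, length) write requests that touch or overlap into larger extents."""
    merged = []
    for offset, length in sorted(requests):
        if merged and offset <= merged[-1][0] + merged[-1][1]:
            # Request starts inside or right at the end of the previous extent: extend it.
            last_offset, last_length = merged[-1]
            end = max(last_offset + last_length, offset + length)
            merged[-1] = (last_offset, end - last_offset)
        else:
            merged.append((offset, length))
    return merged


# Five small, out-of-order writes arriving from different compute nodes.
small_writes = [(0, 4), (8, 4), (4, 4), (32, 8), (36, 8)]
print(coalesce(small_writes))  # [(0, 12), (32, 12)]
```

The backend then sees two requests instead of five, which is the sense in which aggregation simplifies routing between compute nodes and storage.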
From Monitoring to Orchestration
DDN DIO-pro 2016
Temporal and Spatial Patterns...
Donald J. Hatfield, Jeanette Gerald: Program Restructuring for Virtual Memory. IBM Systems Journal, 10 (3): 168-192 (1971)
[Figure: axis label "Time"]
Temporal and Spatial Patterns... are here to stay
Source: Storage Models: Past, Present, and Future. Dries Kimpe and Robert Ross, Argonne National Laboratory
Conclusion: Storage Evolution
Storage getting closer to the CPU
Mechanically, the same needs will arise
Tools convergence
Access latency puts pressure on software design
→ window of opportunity for a drastic redesign
Early IME I/O Accelerator feedback
Harnessing distributed HW resources
→ From Fault tolerance to QoS
Hierarchical storage
→ Narrowing the gap between Storage and Processing
→ Inter-Operable
→ Software-only solutions are versatile
→ System wide profiling
→ Data policies
→ Orchestration by job scheduler
System of Systems: Think Out of the Box!
Merci !
Thank you ! Grazie
Gracias
спасибо
ありがとう
谢谢