Threading Opportunities in High-Performance Flash-Memory Storage Craig Ulmer Sandia National...
-
Upload
maryann-flowers -
Category
Documents
-
view
217 -
download
2
Transcript of Threading Opportunities in High-Performance Flash-Memory Storage Craig Ulmer Sandia National...
![Page 1: Threading Opportunities in High-Performance Flash-Memory Storage Craig Ulmer Sandia National Laboratories, California Maya GokhaleLawrence Livermore National.](https://reader036.fdocuments.in/reader036/viewer/2022083009/5697bf861a28abf838c884fd/html5/thumbnails/1.jpg)
Threading Opportunities in High-Performance Flash-Memory Storage
Craig Ulmer Sandia National Laboratories, California Maya Gokhale Lawrence Livermore National Laboratory
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000
![Page 2: Threading Opportunities in High-Performance Flash-Memory Storage Craig Ulmer Sandia National Laboratories, California Maya GokhaleLawrence Livermore National.](https://reader036.fdocuments.in/reader036/viewer/2022083009/5697bf861a28abf838c884fd/html5/thumbnails/2.jpg)
Revolutionary Storage Technologies
• Storage-Intensive Supercomputing (SISC) at LLNL– System architectures for applications with massive datasets
– New technologies: processing elements, networks, and storage
• NAND-Flash storage in high-performance computing– Flash chips have great potential: 100x access times, 10x bandwidth
– However, few commercial products have delivered performance
• Exception: Fusion-io’s ioDrive– PCIe x4 card with 80-320 GB of flash
– Theoretical read speed of 700 MB/s
– Hardware allows many IOPs to be in-flight concurrently
![Page 3: Threading Opportunities in High-Performance Flash-Memory Storage Craig Ulmer Sandia National Laboratories, California Maya GokhaleLawrence Livermore National.](https://reader036.fdocuments.in/reader036/viewer/2022083009/5697bf861a28abf838c884fd/html5/thumbnails/3.jpg)
Threaded I/O Microbenchmarks
• Observation: Increasing number of IOPs improves performance– Opposite of what we expect from hard drives
– Due to flash memory packaging: chip is actually a die stack
• Implemented a set of I/O microbenchmarks to investigate – Threaded with mixed I/O characteristics (mostly read-only)
– Currently: Block transfer, kNN, external sort, binary search
• Example: k-Nearest Neighbors (kNN)– Stream through all training vectors and find
k vectors that are most similar to each input vector
– Each thread works on portion of training data
![Page 4: Threading Opportunities in High-Performance Flash-Memory Storage Craig Ulmer Sandia National Laboratories, California Maya GokhaleLawrence Livermore National.](https://reader036.fdocuments.in/reader036/viewer/2022083009/5697bf861a28abf838c884fd/html5/thumbnails/4.jpg)
kNN Results
• Single ioDrive vs. Three SATA hard drives in RAID0– ioDrive provided a 3x improvement to end application
– Small number of threads can have large impact with flash memory
0
20
40
60
80
100
120
140
160
1 2 4 8 16 32 64
Input Vectors per Pass
Sin
gle
Pa
ss T
ime
(s
)
SATA RAID - 1 ThreadSATA RAID - 2 ThreadsSATA RAID - 4 ThreadsioDrive - 1 ThreadioDrive - 2 ThreadsioDrive - 4 Threads
ioD
riveS
AT
A
0
50
100
150
200
250
1 2 4 8 16 32 64 128
Threads
Sin
gle
Pas
s T
ime
(s)
SATA RAID - 32 Vectors per Pass
ioDrive - 32 Vectors per Pass
Input Vectors Threads
Tim
e (s
)
Tim
e (s
)