HPC Storage Integration - Dell NFS Storage Solution (NSS) Overview
Onur Celebioglu, HPC Advisory Council Workshop
Dell HPC
The Dell HPC NFS Storage Solutions (NSS) Agenda
• When is NFS needed?
• Where does NFS fit?
– NFS as a function of scale and throughput
• HPC Challenges
• Development of NSS
– What does Dell NSS do for you?
• Starting point for NSS configurations
• NSS Configurations
• NSS Performance
Confidential 2
When is NFS needed?
• Virtually all clusters, regardless of size, need a shared file system
– At the very minimum for /home and applications
• For small to medium systems NFS can also serve as primary file system for jobs
– No need for high-speed scratch file system for most application profiles
• Even for large systems, NFS can serve user /home data and applications
– No need for performance – just reliability and ease of use/management
HPC Challenges: Complexity, Performance, Cost
• Challenge #1 is Compute
› Scalable applications
› Simplified deployment
› Cost effective nodes
• Challenge #2 is the Interconnect
› Low latency networks
› Robust management tools
› Costs as a % of node cost
• Challenge #3 is Storage & I/O
› Common file system
› Good performance and cost effective
› Reliable, easy to configure and manage
› Performance Tuned Configurations
The NSS addresses this
The Dell HPC NFS Storage Solution
[Diagram: NFS Gateway connected to MD1200 storage, with expansion MD1200s]
• Takes the guesswork out of NFS configurations
– Appliance approach to inexpensive NFS solutions
• Range of capacity:
– 24TB to 96TB in a single namespace
• Good performance
– 240 MB/s to 1.45 GB/s for NFS performance
– 6Gbps SAS, optional IB or 10GigE
– Tuned storage and file system configurations
• Cost Effective
• Reliable and supported
– Proven hardware
– 3 years support with Dell including XFS support
– Redundant power supplies, connections, plus drive spares kit
• Easy to install
– Dell configuration and deployment: Whitepaper and Dell PS
– Affordable custom installation services available
Development of Dell NSS
• NSS Goals/Requirements:
– No proprietary hardware/software: true Open Storage
– Appliance approach
• Approach: Dell examined options in terms of:
– File systems, RAID levels and configuration
– LVM configuration
– File system and server OS tuning
– NFS server configuration
• Results:
– NFS Gateway
› Dell PowerEdge™ R710
› Extreme reliability, cost effectiveness, expandability
– Storage:
› Dell PowerVault™ MD1200
› Great performance (6 Gbps SAS), reliability, and $/GB
– A baseline set of configurations
› Great combination of performance and reliability
› Three defined configurations based on capacity
RAID Configurations: Seq Write
[Chart: MD1200 + H800, NL SAS, Seq Write, DD. Y axis: Throughput (MB/s), 0 to 1400. X axis: 6, 12, 24, 36, 48 drives. Series: R6/6, R5/6, R6/12.]
RAID Configurations: Seq Read
[Chart: MD1200 + H800, NL SAS, Seq Read, DD. Y axis: Throughput (MB/s), 0 to 2000. X axis: 6, 12, 24, 36, 48 drives. Series: R6/6, R5/6, R6/12.]
RAID Configurations: Reliability Modeling
• R6/12 MTTDL is much higher than R5/6: ~27,000 years at 84 drives
[Chart: Reliability, R50 based on R5/6, 2TB/Drive. Y axis: MTTDL (Years), 0 to 160. X axis: 6, 12, 24, 36, 48, 60, 72, 84 drives.]
Assumptions:
• 2TB drives
• MTTF of disk: 600K hours
• Hot spare drives
• Bit error rate: 10^-15
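The assumptions above can be plugged into the classic closed-form MTTDL approximations to see why double parity helps. This is only a sketch and not the model behind the chart: the repair time `MTTR_HOURS` is an assumed value that the slides do not give, the formulas ignore unrecoverable read errors and correlated failures, and all function names are illustrative. Expect far more optimistic absolute numbers than the chart shows; only the relative ordering of RAID-5 and RAID-6 is meaningful. The last function separately estimates the chance of an unrecoverable read error during a rebuild, using the slide's bit error rate.

```python
import math

HOURS_PER_YEAR = 24 * 365
MTTF_HOURS = 600_000      # from the slide: MTTF of disk
DRIVE_BYTES = 2e12        # from the slide: 2TB drives
BIT_ERROR_RATE = 1e-15    # from the slide
MTTR_HOURS = 24           # ASSUMPTION: rebuild time onto a hot spare (not in the source)


def mttdl_raid5_hours(n, mttf=MTTF_HOURS, mttr=MTTR_HOURS):
    """Classic approximation: data is lost if a 2nd drive fails during a rebuild."""
    return mttf ** 2 / (n * (n - 1) * mttr)


def mttdl_raid6_hours(n, mttf=MTTF_HOURS, mttr=MTTR_HOURS):
    """Classic approximation: data is lost if a 3rd drive fails during rebuilds."""
    return mttf ** 3 / (n * (n - 1) * (n - 2) * mttr ** 2)


def striped_mttdl_hours(group_mttdl, groups):
    """RAID-50/60: losing any one of `groups` independent groups loses the volume."""
    return group_mttdl / groups


def rebuild_ure_probability(surviving_drives, drive_bytes=DRIVE_BYTES, ber=BIT_ERROR_RATE):
    """Probability of at least one unrecoverable read error while reading
    every bit of the surviving drives during a rebuild."""
    bits = surviving_drives * drive_bytes * 8
    return -math.expm1(bits * math.log1p(-ber))


drives = 84
r50 = striped_mttdl_hours(mttdl_raid5_hours(6), drives // 6)
r60 = striped_mttdl_hours(mttdl_raid6_hours(12), drives // 12)
print(f"R5/6 groups striped over {drives} drives: {r50 / HOURS_PER_YEAR:,.0f} years (idealized)")
print(f"R6/12 groups striped over {drives} drives: {r60 / HOURS_PER_YEAR:,.0f} years (idealized)")
print(f"URE probability when rebuilding one R5/6 group: {rebuild_ure_probability(5):.1%}")
```

The read error estimate is the part that the idealized MTTDL formulas leave out: a RAID-5 group has no remaining redundancy during a rebuild, so any unreadable sector at that point can cost data, while a RAID-6 group with one failed drive can still correct it.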
Benefits of Dell NSS
• Performance tuned NFS server
– Best possible performance
– No need to experiment with tuning options; already tuned
[Chart: Throughput (KB/s), 0 to 1,400,000, vs. number of clients (2, 4, 8, 12, 16, 24, 32). Series: tuned, not tuned. Annotation: 30%.]
Configuration Starting Point: How much capacity? What network?
[Chart: NFS Throughput (MB/s), 0 to 1600, for NSS Small (24TB), NSS Medium (48TB), and NSS Large (96TB). Series: 10GigE Read, 10GigE Write, InfiniBand Read, InfiniBand Write.]
NFS Gateway
[Diagram label: QDR IB]
Dell PowerEdge R710 (NFS Gateway):
• (2) 2.4 GHz Intel Westmere CPUs (4 cores)
• 24 - 48GB memory
– Varies by configuration
• RAID-1 OS w/ hot spare
• RAID-0 swap space (2 drives)
• 3 years support including FS
• PERC H800
– 1GB cache, battery-backed
• RAID-6 or RAID-60
– Varies by configuration
• Tuned LVM for certain configurations
• IB or 10GigE data interface
• GigE management connection
Up to 30% better performance of NSS vs. untuned
Software
• Client:
– NFSv3 compatible client
– If using InfiniBand on the clients:
› OFED 1.5.1
• NSS Server:
– Red Hat Enterprise Linux (RHEL) 5.5
– NFSv3
– Red Hat Scalable File System: XFS version 2.10.2-7
– If using IPoIB:
› OFED 1.5.1
– LVM
NSS Small Configuration: 24TB
[Diagram: NFS gateway (QDR IB) with one MD1200: (12) 2TB 7.2K NL-SAS]
Summary
Raw capacity: 24TB
Expandable to 96TB
Formatted capacity: ~20TB
Expandable to ~80TB
RAID-6
10GigE NFS Performance
Peak Sequential Read: 275 MB/s
Peak Sequential Write: 550 MB/s
InfiniBand NFS Performance (IPoIB)
Peak Sequential Read: 440 MB/s
Peak Sequential Write: 890 MB/s
NSS Medium Solution: 48TB
[Diagram: NFS gateway (QDR IB) with two MD1200s, each with (12) 2TB 7.2K NL-SAS]
Summary
Raw capacity: 48TB
Expandable to 96TB
Formatted capacity: ~40TB
Expandable to ~80TB
RAID-60
RAID-6 within each MD1200
RAID-0 across MD1200s
10GigE NFS Performance
Peak Sequential Read: 490 MB/s
Peak Sequential Write: 840 MB/s
InfiniBand NFS Performance
Peak Sequential Read: 755 MB/s
Peak Sequential Write: 1,350 MB/s
NSS Large Solution: 96TB
[Diagram label: QDR IB]
Summary
Raw capacity: 96TB
Formatted capacity: ~80TB
RAID-60 and LVM
RAID-6 within each MD1200
RAID-0 across MD1200s
LVM to combine LUNs
10GigE NFS Performance
Peak Sequential Read: 850 MB/s
Peak Sequential Write: 1,180 MB/s
InfiniBand NFS Performance
Peak Sequential Read: 1,350 MB/s
Peak Sequential Write: 1,470 MB/s
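The raw and formatted capacities quoted for the three configurations follow from simple arithmetic: each MD1200 holds twelve 2TB drives, and RAID-6 within each enclosure gives up two drives' worth of space to parity. The sketch below reproduces those figures. The enclosure count of four for NSS Large is inferred from 96TB of raw capacity rather than stated in the slides, and the function name is illustrative. Real formatted capacity is slightly lower once file system overhead is included, which is why the slides use "~".

```python
DRIVES_PER_MD1200 = 12     # from the slides: (12) 2TB 7.2K NL-SAS per MD1200
DRIVE_TB = 2               # from the slides
RAID6_PARITY_DRIVES = 2    # RAID-6 stores two drives' worth of parity per group


def nss_capacity_tb(enclosures):
    """Return (raw_tb, usable_tb) with one RAID-6 group per MD1200."""
    raw = enclosures * DRIVES_PER_MD1200 * DRIVE_TB
    usable = enclosures * (DRIVES_PER_MD1200 - RAID6_PARITY_DRIVES) * DRIVE_TB
    return raw, usable


# Enclosure counts: 1 and 2 are shown on the slides; 4 is inferred from 96TB raw.
for name, enclosures in [("NSS Small", 1), ("NSS Medium", 2), ("NSS Large", 4)]:
    raw, usable = nss_capacity_tb(enclosures)
    print(f"{name}: raw {raw}TB, usable after RAID-6 parity ~{usable}TB")
```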
10GigE NFS Performance: Sequential Read
[Chart: NFS 10GigE Sequential Reads. Y axis: Throughput (KB/s), 0 to 900,000. X axis: Threads (Nodes): 1, 2, 4, 8, 16, 24, 32. Series: NSS Small, NSS Medium, NSS Large.]
• 10GigE NFS gateway with GigE clients
• Performance peaks:
– NSS Small: 8 nodes doing IO
– NSS Medium: 24 nodes doing IO
– NSS Large: No peak over range tested
10GigE NFS Performance: Sequential Write
[Chart: NSS 10GigE Sequential Writes. Y axis: Throughput (KB/s), 0 to 1,400,000. X axis: Threads (Nodes): 1, 2, 4, 8, 16, 24, 32. Series: NSS Small, NSS Medium, NSS Large.]
• 10GigE NFS gateway with GigE clients
• Peaks:
– NSS Small: 8 nodes doing IO
– NSS Medium: 8 nodes doing IO
– NSS Large: 16 nodes doing IO
InfiniBand (IPoIB) NFS Performance: Sequential Read
[Chart: NSS IPoIB Sequential Reads. Y axis: Throughput (KB/s), 0 to 1,600,000. X axis: Threads (Nodes): 1, 2, 4, 8, 16, 24, 32. Series: NSS Small, NSS Medium, NSS Large.]
• IPoIB NFS gateway with QDR IB clients
• Peaks:
– NSS Small: 1 node doing IO (fairly level until 4-8 nodes)
– NSS Medium: 4 nodes doing IO (not much drop-off)
– NSS Large: 8 nodes doing IO (good performance over range)
InfiniBand (IPoIB) NFS Performance: Sequential Write
[Chart: NSS IPoIB Sequential Writes. Y axis: Throughput (KB/s), 0 to 1,600,000. X axis: Threads (Nodes): 1, 2, 4, 8, 16, 24, 32. Series: NSS Small, NSS Medium, NSS Large.]
• IPoIB NFS gateway with QDR IB clients
• Peaks:
– NSS Small: 1 node doing IO (steady drop off to 16 nodes)
– NSS Medium: 2 nodes doing IO (good performance for up to 8-16 nodes)
– NSS Large: 4 nodes doing IO (good performance over range of nodes tested)
Random Read IOPS Performance 10GigE and InfiniBand
• Both 10GigE and IB have about the same performance
– Performance dictated by controllers and disks, not network
[Chart: NSS IPoIB Random Read IOPS. Y axis: IOPS, 0 to 5000. X axis: Threads (Nodes): 1, 2, 4, 8, 16, 24, 32. Series: NSS Small, NSS Medium, NSS Large.]
Random Write IOPS Performance 10GigE and InfiniBand
• Both 10GigE and IB have about the same performance
– Performance dictated by controllers and disks, not network
[Chart: NSS IPoIB Random Write IOPS. Y axis: IOPS, 0 to 2500. X axis: Threads (Nodes): 1, 2, 4, 8, 16, 24, 32. Series: NSS Small, NSS Medium, NSS Large.]
Metadata – File Create
[Chart: NSS IPoIB - Metadata - File Create. Y axis: IOPS, 0 to 30,000. X axis: Nodes (Threads): 1, 2, 4, 8, 16, 32. Series: Small, Medium, Large.]
Observations
• Performance of IB (IPoIB) for a single node is excellent
– One reason is that the clients in the 10GigE testing use GigE instead of IB
• For most node counts, IPoIB is better than 10GigE
• Many applications have just one MPI process doing IO
– You only have a very small number of nodes performing IO
– In these cases, using NFS over IB (IPoIB) will greatly help your I/O performance
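A rough arithmetic check on the observations above, using the peak sequential numbers from the configuration slides. Because the clients in the 10GigE tests were connected over GigE, a single client cannot exceed the GigE line rate of about 125 MB/s (1 Gbps / 8, ignoring protocol overhead), so several clients are needed before the gateway's peak can be reached. The sketch below computes that lower bound and the IPoIB-to-10GigE ratio of the peaks. The function names and the dictionary layout are illustrative, not from the source.

```python
import math

# Theoretical line rate of one GigE client link, ignoring protocol overhead.
GIGE_CLIENT_MBPS = 1000 / 8  # 125 MB/s

# Peak sequential NFS throughput in MB/s, copied from the configuration slides.
PEAKS = {
    "NSS Small":  {"10GigE": {"read": 275, "write": 550},
                   "IPoIB":  {"read": 440, "write": 890}},
    "NSS Medium": {"10GigE": {"read": 490, "write": 840},
                   "IPoIB":  {"read": 755, "write": 1350}},
    "NSS Large":  {"10GigE": {"read": 850, "write": 1180},
                   "IPoIB":  {"read": 1350, "write": 1470}},
}


def min_gige_clients(peak_mbps):
    """Lower bound on the number of GigE clients needed to reach a given peak."""
    return math.ceil(peak_mbps / GIGE_CLIENT_MBPS)


def ipoib_gain(config, op):
    """Ratio of the IPoIB peak to the 10GigE peak for one configuration."""
    return PEAKS[config]["IPoIB"][op] / PEAKS[config]["10GigE"][op]


for config, nets in PEAKS.items():
    for op in ("read", "write"):
        clients = min_gige_clients(nets["10GigE"][op])
        gain = ipoib_gain(config, op)
        print(f"{config} {op}: at least {clients} GigE clients to reach the "
              f"10GigE peak; IPoIB peak is {gain:.2f}x the 10GigE peak")
```

This is one way to read why the 10GigE peaks appear at higher node counts than the IPoIB peaks, and why a job with a single I/O process benefits most from an IB-attached client.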
NSS-HA Large Solution: 96TB
[Diagram labels: QDR IB, QDR IB]
Summary
Raw capacity: 96TB
Formatted capacity: ~80TB
RAID-60 and LVM
RAID-6 within each MD enclosure
HA-LVM to combine LUNs
InfiniBand NFS Performance
Peak Sequential Read: 2.4 GB/s
Peak Sequential Write: 1.3 GB/s
NSS-HA Large Performance
• Improved Read performance due to embedded RAID controller
• Writes around 1.3 GB/s with write cache mirroring enabled
Summary
• Virtually all clusters, regardless of size, need a shared file system
– For small to medium systems NFS can also serve as primary file system for jobs
– Even for large systems, NFS can serve user /home data and applications
• Dell NSS is designed to take the guesswork out of NFS configurations
• Three pre-configured systems
– 24TB, 48TB, and 96TB; QDR IB or 10GigE connection to network
– Range of capacity
– Tuned configurations (good performance)
– Cost Effective
– Easy to configure
– Affordable Dell deployment/installation available
Thank you!
http://i.dell.com/sites/content/business/solutions/hpcc/en/Documents/Dell-NSS-NFS-Storage-solution-final.pdf