A comparison of distributed data storage middleware for HPC, GRID and Cloud
Mikhail Goldshtein1, Andrey Sozykin1, Grigory Masich2 and
Valeria Gribova3
1Institute of Mathematics and Mechanics UrB RAS, Yekaterinburg, Russia
2Institute of Continuous Media Mechanics UrB RAS, Perm, Russia
3Institute of Automation and Control Processes FEB RAS, Vladivostok, Russia
European Middleware Initiative
EMI - software platform for high-performance distributed computing, http://www.eu-emi.eu
Joint effort of the major European distributed computing middleware providers (ARC, dCache, gLite, UNICORE)
Widely used in Europe, including the Worldwide LHC Computing Grid (WLCG)
Higgs boson:
• Alberto Di Meglio: "Without the EMI middleware, such an important result could not have been achieved in such a short time."
Storage solutions in EMI
dCache - http://www.dcache.org/
Disk Pool Manager (DPM) - https://svnweb.cern.ch/trac/lcgdm/wiki/Dpm
StoRM (STOrage Resource Manager) - http://storm.forge.cnaf.infn.it/
dCache
Disk Pool Manager
StoRM
Usage statistics in WLCG
Distributed storage systems
Traditional approach:
• Grid
• Distributed file systems (IBM GPFS, Lustre File System, etc.)
Modern technologies:
• Standard Internet Protocols (Parallel NFS, WebDAV, etc.)
• Cloud storage (Amazon S3, HDFS, etc.)
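As a concrete illustration of the standard-protocols approach, the directory-listing operation of WebDAV (supported by dCache and DPM) can be exercised with nothing but the Python standard library. This is a minimal sketch; the endpoint URL is a placeholder, not a real storage element:

```python
# Sketch: building a WebDAV PROPFIND request that lists one directory
# level on a WebDAV-capable storage element (dCache, DPM).
# "storage.example.org" is a placeholder host, not a real service.
import urllib.request

def make_propfind_request(url: str, depth: int = 1) -> urllib.request.Request:
    """Build a WebDAV PROPFIND request asking for all properties."""
    body = (b'<?xml version="1.0" encoding="utf-8"?>'
            b'<propfind xmlns="DAV:"><allprop/></propfind>')
    req = urllib.request.Request(url, data=body, method="PROPFIND")
    # Depth: 1 limits the listing to the directory's immediate children.
    req.add_header("Depth", str(depth))
    req.add_header("Content-Type", "application/xml")
    return req

req = make_propfind_request("https://storage.example.org:2880/data/")
# urllib.request.urlopen(req) would return a 207 Multi-Status response
# containing one <response> element per directory entry.
print(req.get_method(), req.get_header("Depth"))  # → PROPFIND 1
```

Sending the same request against a grid storage element would normally also require X.509 credentials or another authentication mechanism, which is omitted here.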
Classic NFS
Parallel NFS
Comparison results
Feature                  | dCache                    | DPM                       | StoRM
-------------------------|---------------------------|---------------------------|--------------------------------
Grid protocols           | SRM, xroot, dcap, GridFTP | SRM, RFIO, xroot, GridFTP | SRM, RFIO, xroot, GridFTP, file
Standard protocols       | NFS 4.1, WebDAV           | NFS 4.1, WebDAV           | -
Cloud backend            | HDFS (in development)     | HDFS, Amazon S3           | -
Quality of documentation | High                      | Medium                    | High
Ease of administration   | Easy                      | Medium                    | Easy
Distributed dCache-based Tier-1 WLCG storage
Implementation
Implementation details
Hardware: 4 x Supermicro servers (3 in Yekaterinburg, 1 in Perm), 210 TB usable capacity (252 TB raw; RAID5 + hot spare)
OS: Scientific Linux 6.3
dCache 2.6 from the EMI repository
Protocol: NFS v4.1 (Parallel NFS)
RHEL ships a Parallel NFS client, so no additional software needs to be installed on the clusters
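For orientation, a pNFS mount of the kind described above can be configured on a RHEL 6 compute node roughly as follows. The server name and paths are placeholders, not the actual endpoints of this deployment:

```shell
# /etc/fstab entry for mounting a dCache NFS v4.1 (pNFS) door.
# "dcache-door.example.org" and the export path are placeholders.
dcache-door.example.org:/data  /mnt/dcache  nfs  vers=4.1,rw  0  0
```

The same mount can be requested manually with `mount -t nfs -o vers=4.1 dcache-door.example.org:/data /mnt/dcache`; once mounted, the kernel's pNFS layout driver sends file I/O directly to the dCache pool nodes rather than through the door.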
Performance testing
IOR test (http://www.nersc.gov/systems/trinity-nersc-8-rfp/nersc-8-trinity-benchmarks/ior/)
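A typical IOR invocation for this kind of measurement might look as follows; the process count, file sizes, and mount point are illustrative assumptions, not the parameters used in these experiments:

```shell
# Each of 8 MPI processes writes (-w) and then reads (-r) its own
# 1 GiB file (-F = file per process) through the pNFS mount,
# using 4 MiB transfers via the POSIX I/O API.
mpirun -np 8 ior -a POSIX -w -r -F \
    -b 1g -t 4m -o /mnt/dcache/ior-testfile
```

The file-per-process mode stresses aggregate throughput of the distributed pools; a single shared file (without `-F`) would instead stress byte-range coordination on one file.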
Future work
Evaluation of NFS performance over 10 GbE and WAN
Evaluation of dCache in experiments (Particle Image Velocimetry, etc.)
Participation in GRID projects:
• Grid of the Russian National Nanotechnology Network
• WLCG (through the Joint Institute for Nuclear Research, Dubna, Russia)
Connection to a Hadoop cluster (once dCache supports HDFS)
Thank you!
Andrey Sozykin
Institute of Mathematics and Mechanics
UrB RAS, Yekaterinburg, Russia