Data Management, Storage and Access Optimization in High Performance Distributed Environment
Xiaohui Shen
Department of Electrical and Computer Engineering
Northwestern University
January 17, 2001
Outline
- Problem Definition
- Solutions
  - Meta-data Management System
  - Remote Storage Access Optimizations
  - Multi-Storage I/O System
  - Distributed Parallel File System
  - I/O performance prediction and evaluation
  - Integrated working environment
Motivation
Current Solutions
- Parallel file systems and runtime libraries: smart I/O optimizations (caching, prefetching, parallel I/O)
  - User interfaces are low-level
  - Not portable
  - Hard-coded I/O selection is difficult for runtime systems
- Database systems: high-level, easy to use, portable
  - Lack powerful I/O optimizations
System Architecture
Tasks
- Meta-data Management System
- Remote Storage Access Optimizations
- Efficient Storage Organization
  - Multi-Storage I/O System
  - Distributed Parallel File System
- I/O performance prediction and evaluation
- Integrated working environment
Part 1: Meta-data Management System (MDMS)
- Abstract Storage Devices (ASDs)
- Storage patterns and access patterns
- Access history and trail of navigation
MDMS Tables

| Table name | Functionality | Primary key |
| --- | --- | --- |
| Run table | Records each run of the application with user-specified attributes | Run id |
| Dataset table | Keeps information about the datasets used in each run | Run id + association id |
| Access pattern table | Keeps the access pattern specified by the user for each dataset | Run id + dataset name |
| Storage pattern table | Keeps information on how data is stored for each dataset | Dataset name |
| Execution table | Records I/O activities of the run, including file path and name, offset, etc. | Run id + dataset + iteration number |
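The five tables and their primary keys listed above could be declared roughly as follows. This is only a sketch: it uses Python's built-in `sqlite3` module instead of the Postgres database used in the talk, and every column name and type is an assumption, since the slide gives only table names, purposes and keys.

```python
import sqlite3

# Hypothetical column names and types; only the table names and
# primary keys come from the slide.
SCHEMA = """
CREATE TABLE run (
    run_id      INTEGER PRIMARY KEY,
    attributes  TEXT
);
CREATE TABLE dataset (
    run_id          INTEGER,
    association_id  INTEGER,
    dataset_name    TEXT,
    PRIMARY KEY (run_id, association_id)
);
CREATE TABLE access_pattern (
    run_id        INTEGER,
    dataset_name  TEXT,
    pattern       TEXT,
    PRIMARY KEY (run_id, dataset_name)
);
CREATE TABLE storage_pattern (
    dataset_name  TEXT PRIMARY KEY,
    layout        TEXT
);
CREATE TABLE execution (
    run_id        INTEGER,
    dataset_name  TEXT,
    iteration     INTEGER,
    file_path     TEXT,
    file_offset   INTEGER,
    PRIMARY KEY (run_id, dataset_name, iteration)
);
"""


def create_mdms(conn: sqlite3.Connection) -> None:
    """Create the five MDMS-style tables in the given database."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_mdms(conn)

# Record one I/O activity, then look it up by the composite key.
conn.execute("INSERT INTO execution VALUES (1, 'temp', 0, '/data/temp.0', 0)")
row = conn.execute(
    "SELECT file_path, file_offset FROM execution "
    "WHERE run_id = 1 AND dataset_name = 'temp' AND iteration = 0"
).fetchone()
```

Keying the execution table on run, dataset and iteration lets a later run find where a given time step was written without the user tracking file paths.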
MDMS Internal Representation
MDMS I/O Flow (API)
Optimizations inside MDMS
Part 2: Remote Storage Access Optimization for HSS
- Secondary storage access techniques: collective I/O, data sieving, caching, prefetching, etc.
- Tertiary storage systems interact directly with applications
- Remote environment
Optimizations
- Remote Collective I/O
- Remote Data Sieving
- Asynchronous I/O
- Subfile
- Superfile
- Migration, Stage and Purge
- SRB Container
Optimization: Subfile
Optimization: Superfile
- Create: one large file
- Access: the first access brings the whole large file into memory; subsequent accesses are serviced directly from memory
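The superfile idea can be sketched in a few lines of Python. The class name `Superfile`, its in-memory index and the `file_reads` counter are all hypothetical, introduced only for illustration; the real system works against remote storage through SRB rather than a local file.

```python
import os
import tempfile


class Superfile:
    """Pack many small files into one large file plus an offset index."""

    def __init__(self, path):
        self.path = path
        self.index = {}        # name -> (offset, length) inside the superfile
        self._cache = None     # whole superfile contents after the first read
        self.file_reads = 0    # how many times storage was actually touched

    def create(self, members):
        """Write all members back to back into one large file."""
        offset = 0
        with open(self.path, "wb") as f:
            for name, data in members.items():
                f.write(data)
                self.index[name] = (offset, len(data))
                offset += len(data)

    def read(self, name):
        """The first call loads the whole superfile; later calls hit memory."""
        if self._cache is None:
            with open(self.path, "rb") as f:
                self._cache = f.read()
            self.file_reads += 1
        offset, length = self.index[name]
        return self._cache[offset:offset + length]


with tempfile.TemporaryDirectory() as tmp:
    sf = Superfile(os.path.join(tmp, "steps.super"))
    sf.create({"step0": b"aaaa", "step1": b"bb"})
    first = sf.read("step1")    # triggers the single large read
    second = sf.read("step0")   # served from memory
```

The benefit comes from paying the per-request latency of remote storage once instead of once per small file.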
Other Optimizations
- Migration
- Stage
- Purge
- SRB Container
Part 3: MS-I/O: A Multi-storage I/O System
- Further performance improvement is limited by the nature of the storage media.
- The problem is rooted in the traditional single-storage resource architecture.
Solution: Multi-storage Resource Architecture
- Increases logical storage capacity
- Provides a more flexible and reliable computing environment
- Provides new opportunities for further performance improvement
Multi-storage Resource Architecture
Experimental Environment
- Local Postgres database
- Local disks
- Remote disks
- Remote tapes
- Compute resource: Argonne SP2
Multi-storage I/O System
Database Tables and I/O Routines
- Run table
- Dataset table
- Access pattern table
- Storage pattern table
- Execution table
User Access Pattern (write)

| Field | Description | Value | Optimizations |
| --- | --- | --- | --- |
| Data Partition | How data is partitioned among processors | `BBB`, `B**`, `BB`, `B*`, etc. | Collective I/O |
| Write Size | The size of the dataset | Huge, large, medium, small | Data location, subfile, superfile |
| Write Sequence | Whether there is a sequence of data files (time steps) | Yes, no | Superfile, asynchronous I/O |
| When Use | When this dataset will be accessed | Soon, long, never | Data location, data duplication |
| Use Frequency | How often this dataset will be accessed | Frequent, seldom, never | Data location, data duplication |
| Compute Time | Whether compute time is a significant part | Large, small | Asynchronous I/O |
| Future Read Size | How much of the dataset will be accessed | Whole, partial | Subfile |
| Future Read Sequence | Whether a sequence of data files will be accessed | Yes, no | Asynchronous I/O, superfile |
| Duration | Data's lifetime on storage | Permanent, temporary | Data location, data duplication |
User Access Pattern (read)

| Field | Description | Value | Optimizations |
| --- | --- | --- | --- |
| Data Partition | How data is partitioned among processors | `BBB`, `B**`, `BB`, `B*`, etc. | Collective I/O |
| Use Frequency | How often this dataset will be accessed | Frequent, seldom, never | Data location, data duplication |
| Compute Time | Whether compute time is a significant part | Large, small | Asynchronous I/O |
| Read Size | How much of the dataset will be accessed | Whole, partial | Subfile |
| Read Sequence | Whether a sequence of data files will be accessed | Yes, no | Asynchronous I/O, superfile |
Optimization decision Flow
Applications and Tools
Experimental Environment
- Applications: IBM SP2 at Argonne
- Multiple storage resources:
  - Local disks: Argonne SP2
  - Remote disks: SDSC
  - Remote tapes: SDSC HPSS
  - Local database: Postgres at NWU
MS-I/O Experiments: Data Analysis on Astrophysics Data
[Figure: I/O Time of Data Analysis; x-axis 1 to 6, y-axis 0 to 200000]
- No access pattern, then Remote Tape
- DataPartition='BBB', then Remote Tape + Collective I/O
- WhenUse='soon' & Size='medium', then Remote Disk
- Plus DataPartition='BBB', then Remote Disk + Collective I/O
- Plus UseFrequency='frequent', then Local Disk
- Plus DataPartition='BBB', then Local Disk + Collective I/O
MS-I/O Experiments: Volume Rendering
[Figure: Execution Time of Volren; x-axis 1 to 6, y-axis 0 to 1200]
- No access pattern, then Remote Tape
- ComputeTime='large', then Remote Tape + Asynchronous I/O
- WhenUse='soon' & Size='medium', then Remote Disk
- Plus ComputeTime='large', then Remote Disk + Asynchronous I/O
- Plus UseFrequency='frequent', then Local Disk
- Plus ComputeTime='large', then Local Disk + Asynchronous I/O
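The rules visible in the two experiments above can be collected into a small decision function. This is a sketch of only those rules, not of the full optimization decision flow; the function name `choose_plan` and the dictionary-of-hints interface are assumptions made for illustration.

```python
def choose_plan(hints):
    """Map user access-pattern hints to a storage location and optimizations.

    Only the rules shown in the data analysis and volume rendering
    experiments are modeled; any other rule in the real system is omitted.
    """
    # With no usable hints the data ends up on remote tape.
    location = "remote tape"
    if hints.get("WhenUse") == "soon" and hints.get("Size") == "medium":
        location = "remote disk"
        # Frequent use on top of that moves the data to local disk.
        if hints.get("UseFrequency") == "frequent":
            location = "local disk"

    optimizations = []
    if hints.get("DataPartition") == "BBB":
        optimizations.append("collective I/O")
    if hints.get("ComputeTime") == "large":
        optimizations.append("asynchronous I/O")
    return location, optimizations


plan = choose_plan({
    "WhenUse": "soon",
    "Size": "medium",
    "UseFrequency": "frequent",
    "DataPartition": "BBB",
})
```

Storage location and access optimization are chosen independently here, which matches how each experiment adds one hint at a time.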
MS-I/O Experiments: Subfile and Superfile
[Figure: I/O Time of Subfile; Naive vs. Subfile on Remote Disks and Remote Tapes; y-axis 0 to 80000; condition: WriteSize='huge' & FutureReadSize='partial']
[Figure: I/O Time of Superfile; Naive vs. Superfile for 10 Files and 20 Files; y-axis 0 to 40; condition: WriteSize='small' & WriteSequence='y' & FutureReadSequence='y']
MS-I/O Experiments: Replication and Access History
[Figure: Data Access History; x-axis 1 to 6, y-axis 0 to 700]
[Figure: I/O Time of Data Duplication; x-axis 1 to 6, y-axis 0 to 700]
- Dataset was first placed at the remote site
- Read.UseFrequency='frequent'
- A frequently used dataset is detected
Part 4: DPFS: A Distributed Parallel File System
- Collects idle distributed storage as a supplement to the native storage of parallel computing systems
- Characteristics: distributed, parallel, file system, database
System Architecture of DPFS
Software Architecture of DPFS
- Parallelism
- Concurrency
DPFS BSU and File view
A Basic Striping Unit (BSU) is called a brick in DPFS. Its size is 64 KB.
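As a small illustration of what a fixed brick size implies, the helper below maps a byte range to the bricks it touches. It assumes the simplest file view, in which bytes are laid out in order across consecutive bricks; the function name is hypothetical.

```python
BRICK_SIZE = 64 * 1024  # 64 KB brick, as stated on the slide


def bricks_for_range(offset, length):
    """Return the brick indices touched by bytes [offset, offset + length)."""
    if length <= 0:
        return []
    first = offset // BRICK_SIZE
    last = (offset + length - 1) // BRICK_SIZE
    return list(range(first, last + 1))


touched = bricks_for_range(100_000, 200_000)
```

A request that straddles a brick boundary by even one byte touches both bricks, so small unaligned requests can cost two brick accesses.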
Striping Methods
- Linear Striping
- Multi-dimensional Striping
- Array Striping
Linear Striping
Problems of Linear Striping
Multi-dimensional Striping
Array Striping
Striping Algorithms
- Round-robin
- Greedy algorithm
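The two placement strategies can be sketched as follows. Round-robin is the standard rule. The slide does not define the greedy algorithm, so the version here is an assumption: each brick goes to the I/O node that would finish it earliest, given a hypothetical per-brick service time for each node.

```python
import heapq


def round_robin(num_bricks, num_nodes):
    """Brick i goes to I/O node i mod num_nodes."""
    return [i % num_nodes for i in range(num_bricks)]


def greedy(num_bricks, cost_per_brick):
    """Assumed greedy rule: place each brick on the node that finishes it first.

    cost_per_brick[n] is a hypothetical per-brick service time for node n.
    """
    # Heap entries are (finish time if this node takes one more brick, node id).
    heap = [(cost, node) for node, cost in enumerate(cost_per_brick)]
    heapq.heapify(heap)
    placement = []
    for _ in range(num_bricks):
        finish, node = heapq.heappop(heap)
        placement.append(node)
        heapq.heappush(heap, (finish + cost_per_brick[node], node))
    return placement


rr = round_robin(6, 4)
gr = greedy(6, [1, 2])   # node 0 is twice as fast as node 1
```

Under this assumed rule a faster node receives proportionally more bricks, whereas round-robin treats all nodes as equal regardless of speed.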
Request Combination
P0: 0-7, P1: 8-15, P2: 16-23, P3: 24-31
P0(0,4), P1(9,13), P2(18,22), P3(27,31)
P0(1,5), P1(10,14), P2(19,23), P3(24,28)
...
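The example above is consistent with bricks placed round-robin on four I/O nodes, where each processor merges the bricks that live on the same node into one request and starts at a different node. The sketch below reproduces that pattern; the count of four I/O nodes and the starting rule are inferred from the numbers and are assumptions, as are the function names.

```python
from collections import defaultdict

NUM_IO_NODES = 4  # assumption: inferred from the example, not stated on the slide


def combine(bricks, num_nodes=NUM_IO_NODES):
    """Group a processor's brick requests by the I/O node that holds them."""
    by_node = defaultdict(list)
    for b in bricks:
        by_node[b % num_nodes].append(b)   # round-robin placement
    return dict(by_node)


def schedule(proc, bricks, num_nodes=NUM_IO_NODES):
    """Order combined requests so processor p starts at node p mod num_nodes."""
    groups = combine(bricks, num_nodes)
    steps = []
    for k in range(num_nodes):
        node = (proc + k) % num_nodes
        if node in groups:
            steps.append((node, tuple(groups[node])))
    return steps


p0 = schedule(0, range(0, 8))
p3 = schedule(3, range(24, 32))
```

With this ordering the four processors hit four different I/O nodes in every step, so combined requests do not queue behind one another.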
Meta-data and Database
Tree Structure
Application Programming Interface
- DPFS-Open()
- DPFS-Write()
- DPFS-Read()
- DPFS-Close()
User Interface
- File system commands: cp, mkdir, rm, ls, etc.
- File transfer between DPFS and a general sequential file system
- Example: cp local:my.data DPFS:/home/xhshen:4:greedy
Experimental Environment
- Compute resource: Argonne IBM SP2
- Storage resources:
  - Class 1: Argonne Linux machines (Fast Ethernet and ATM)
  - Class 2: NWU workstations (155 Mbps ATM)
  - Class 3: NWU workstations (10 Mbps Ethernet)
DPFS Performance Numbers: File Level Comparison
[Figure: File Level Comparisons, 8 compute nodes, 4 I/O nodes; groups: Class 3, Class 2, Class 1; series: Linear, Combined Linear, Multi-dim, Combined Multi-dim, Array, Combined Array]
[Figure: File Level Comparisons, 16 compute nodes, 8 I/O nodes; same groups and series]
DPFS Performance Numbers: Striping Algorithm Comparison
[Figure: Striping Algorithm Comparison (8 compute nodes, 8 I/O nodes); groups: Round-robin, Greedy; series: Write, Combined Write, Read, Combined Read]
[Figure: Striping Algorithm Comparison (16 compute nodes, 16 I/O nodes); same groups and series]
Part 5: I/O Performance Prediction and Evaluation
- Performance Model
- Performance Prediction Algorithm
Performance Model
$$T(s) = T_{conn} + T_{open} + T_{seek} + T_{read/write}(s) + T_{fileclose} + T_{connclose}$$
Performance Prediction Algorithm
- $M$: number of datasets
- $N$: total number of iterations
- $freq(j)$: I/O frequency
- $n(j)$: number of I/O calls
- $t_j(s)$: data transfer time (stored in the database)

$$T_{prediction} = \sum_{j=1}^{M} \left( \frac{N}{freq(j)} + 1 \right) n(j)\, t_j(s)$$
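A minimal sketch of the prediction, reading the formula as the sum over datasets of (N / freq(j) + 1) · n(j) · t_j(s). The "+ 1" term is part of that reading of the slide and should be treated as an assumption. In the real system t_j(s) is looked up in the database; here it is passed in directly, and the dictionary keys are hypothetical.

```python
def predict_io_time(datasets, total_iterations):
    """Sum (N / freq(j) + 1) * n(j) * t_j(s) over all datasets.

    Each dataset is a dict with hypothetical keys:
      freq  - I/O frequency, in iterations between I/O phases
      calls - number of I/O calls per phase, n(j)
      t     - measured transfer time for one call, t_j(s), in seconds
    """
    total = 0.0
    for d in datasets:
        phases = total_iterations / d["freq"] + 1   # I/O phases for dataset j
        total += phases * d["calls"] * d["t"]
    return total


predicted = predict_io_time(
    [
        {"freq": 10, "calls": 2, "t": 0.5},   # (100/10 + 1) * 2 * 0.5 = 11.0
        {"freq": 25, "calls": 1, "t": 2.0},   # (100/25 + 1) * 1 * 2.0 = 10.0
    ],
    total_iterations=100,
)
```

Because every term comes from values already recorded for earlier runs, the estimate can be produced before the application is launched.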
Part 6: Integrated Java Graphical User Interface
Functions of IJ-GUI
- Registering new applications
- Running applications remotely
Functions of IJ-GUI
- Data analysis and visualization
- Table browsing and searching
Functions of IJ-GUI
Automatic code generator
Functions of IJ-GUI
- I/O performance prediction
I/O Latency Reduction for Interactive Visualization
[Figure: I/O latency reduction; x-axis 1 to 10, y-axis Time (s) from 0 to 450; series: Visualization Time, I/O Time]
Summary of Contributions
- Meta-data Management System
- Remote Storage Access Optimizations
- Multi-Storage I/O System
- Distributed Parallel File System
- I/O performance prediction and evaluation
- Integrated working environment
Publications
- A Multi-Storage Resource Architecture and I/O Performance Prediction for Scientific Computing, by X. Shen and A. Choudhary. Cluster Computing Journal.
- A Novel Application Development Environment for Large-Scale Scientific Computations, by X. Shen, W. Liao, A. Choudhary, et al. ACM ICS2000.
- Remote I/O Optimization and Evaluation for Tertiary Storage Systems through Storage Resource Broker, by X. Shen, W. Liao and A. Choudhary. IASTED Applied Informatics, Innsbruck, Austria, 2001.
- A Java Graphical User Interface for Large-Scale Scientific Computations in Heterogeneous Systems, by X. Shen, G. Thiruvathukal, W. Liao, A. Choudhary, and A. Singh. HPC-ASIA, May 2000.
- Meta-Data Management System for High-Performance Large-Scale Scientific Data Access, by W. Liao, X. Shen, A. Choudhary. HiPC 2000.
- Data Management for Large-Scale Scientific Computations in High Performance Distributed Systems, by A. Choudhary, M. Kandemir, H. Nagesh, J. No, X. Shen, V. Taylor, S. More, and R. Thakur. In Proc. HPDC-99.
- A Multi-Storage Resource Architecture and I/O Performance Prediction for Scientific Computing, by Xiaohui Shen and Alok Choudhary. HPDC-00.
- A Distributed Multi-Storage I/O System for High Performance Data Intensive Computing, by Xiaohui Shen and Alok Choudhary.
- DPFS: A Distributed Parallel File System, by Xiaohui Shen and Alok Choudhary.
- An Integrated Graphical User Interface for High Performance Distributed Computing, by Xiaohui Shen, Wei-keng Liao and Alok Choudhary.
- A Multimedia Integrated Parallel File System, by J. Carretero, W. Zhu, X. Shen, A. Choudhary. JCIS98.
Future Directions-1
Future Directions-2