1. 2 Corollary 3 System Overview Second Key Idea: Specialization Think GoogleFS.
Corollary
System Overview
Second Key Idea: Specialization
• Think GoogleFS
http://netsyslab.ece.ubc.ca
Third idea: Enable cross-layer optimizations
Layered Architectures: High benefits, but …
• TCP/IP
• File System
• Benefits, but …
– … limits information flow across layers.
API
Cross-Layer Optimizations
• Examples
– IP
– Storage systems
– ….
• Applications → Storage System
– Performance
– QoS requirements
– Consistency requirements
• Applications ← Storage System
– Provide storage-level information to applications
Data Intensive Schedulers:
Notification about data movements
Data Intensive Applications:
Co-usage of files
What’s missing? A vehicle to pass information across layers
Traditional Use of Custom Metadata
Application Layer
File System Layer
Storage System Layer
Metadata Manager
File Organization Module
Basic File System
File Browser
input.dat
Author=Smith
POSIX API
HPDC'08
Cross-Layer Communication
Application Layer
File System Layer
Storage System Layer
Metadata Manager
File Organization Module
Basic File System
Replicate input.dat 3x
input.dat moved from node1 to node3
OK. Schedule Task on node3
POSIX API
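The exchange on this slide can be sketched as follows, assuming custom metadata is kept as per-file key/value tags that both layers can read and write. The class `MetadataManager`, the tag names `replication` and `location`, and the callback mechanism are hypothetical and chosen only for illustration; the slides do not specify an API.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional


class MetadataManager:
    """Hypothetical in-memory stand-in for the storage system's metadata manager."""

    def __init__(self) -> None:
        self.tags: Dict[str, Dict[str, str]] = defaultdict(dict)
        self.listeners: List[Callable[[str, str, str], None]] = []

    def subscribe(self, listener: Callable[[str, str, str], None]) -> None:
        self.listeners.append(listener)

    def set_tag(self, path: str, key: str, value: str) -> None:
        # Store the tag, then tell every subscriber about the change.
        self.tags[path][key] = value
        for notify in self.listeners:
            notify(path, key, value)

    def get_tag(self, path: str, key: str) -> Optional[str]:
        return self.tags[path].get(key)


class Scheduler:
    """Toy data-aware scheduler that remembers where each file lives."""

    def __init__(self) -> None:
        self.location: Dict[str, str] = {}

    def on_tag(self, path: str, key: str, value: str) -> None:
        if key == "location":
            self.location[path] = value

    def pick_node(self, path: str, default: str = "node1") -> str:
        return self.location.get(path, default)


manager = MetadataManager()
scheduler = Scheduler()
manager.subscribe(scheduler.on_tag)

# Application to storage system: request three replicas of input.dat.
manager.set_tag("input.dat", "replication", "3")
# Storage system to application: input.dat moved from node1 to node3.
manager.set_tag("input.dat", "location", "node3")

print(scheduler.pick_node("input.dat"))  # node3
```

The point of the sketch is that one tagging mechanism carries information in both directions, so neither layer needs a dedicated side channel.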
Recap
• Object-based storage
• Enable specialization --> performance
• Enable cross-layer optimization --> generality
One intended use: A Workflow-Aware Storage System
Workflow Example - ModFTDock
• Protein docking application
Simulates the creation of a complex protein from two known proteins
• Applications
Drug design
Protein interaction prediction
Platform Example – Argonne BlueGene/P
160K cores
10 Gb/s Switch
Complex
10 Gb/s Switch
Complex
GPFS
24 I/O servers
IO rate: 8GBps = 51KBps / core !!
2.5K IO Nodes
Torus Network
2.5 GBps per node
3D Torus
850 MBps per 64 nodes
Tree
The central storage is a potential bottleneck
Underused resources
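The per-core figure on this slide can be checked with a few lines of arithmetic. The sketch below assumes binary units (8 GBps taken as 8 × 2^30 bytes per second, and 160K cores taken as 160 × 1024); the slide does not state which unit convention it uses.

```python
# Aggregate I/O rate of the central storage, assuming binary units.
aggregate_bytes_per_s = 8 * 2**30

# Core count, assuming "160K" means 160 * 1024.
cores = 160 * 1024

# Share of the aggregate rate available to each core, in KB/s (1 KB = 1024 bytes).
per_core_kb = aggregate_bytes_per_s / cores / 1024

print(f"{per_core_kb:.1f} KB/s per core")  # 51.2 KB/s per core
```

With decimal units instead the result is 50 KB/s, so the conclusion is the same either way: each core gets only tens of kilobytes per second if all of them hit the central storage at once.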
Background – ModFTDock in Argonne BG/P
Backend file system (e.g., GPFS, NFS)
Scale: 40960 Compute nodes
File based communication
Large IO volume
Workflow Runtime Engine
1.2 M Docking Tasks
IO rate: 8GBps = 51KBps / core
Compute nodes (diagram): each runs an App. task with Local storage
Intermediate Storage Approach
Backend file system (e.g., GPFS, NFS)
Compute nodes (diagram): each runs an App. task with Local storage
Intermediate Storage
…
POSIX API
Workflow Runtime Engine
Scale: 40960 Compute nodes
Stage In
Stage Out
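The stage-in / stage-out pattern in this diagram can be sketched as below. Two temporary directories stand in for the backend file system and the intermediate storage, and `run_task` is a placeholder for an application task; all function and file names here are hypothetical. The only thing taken from the slide is that tasks reach the intermediate storage through ordinary POSIX-style file access.

```python
import shutil
import tempfile
from pathlib import Path


def run_task(workdir: Path) -> None:
    """Placeholder application task: reads input.dat, writes output.dat."""
    data = (workdir / "input.dat").read_text()
    (workdir / "output.dat").write_text(data.upper())


def run_with_staging(backend: Path, intermediate: Path) -> None:
    # Stage in: copy the input from the backend file system once.
    shutil.copy(backend / "input.dat", intermediate / "input.dat")
    # The task performs all of its I/O against the intermediate storage.
    run_task(intermediate)
    # Stage out: persist only the final result to the backend.
    shutil.copy(intermediate / "output.dat", backend / "output.dat")


with tempfile.TemporaryDirectory() as b, tempfile.TemporaryDirectory() as i:
    backend, intermediate = Path(b), Path(i)
    (backend / "input.dat").write_text("protein a, protein b")
    run_with_staging(backend, intermediate)
    result = (backend / "output.dat").read_text()

print(result)  # PROTEIN A, PROTEIN B
```

In this arrangement the backend sees one read and one write per workflow rather than the traffic of every task.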
Usage scenario II:
• Support for deduplication
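The slide gives no detail on how deduplication would be supported. One common approach, shown here purely as an assumption, is content addressing: data is split into fixed-size chunks, each chunk is named by its SHA-256 hash, and identical chunks are stored only once. `ChunkStore` and the 4-byte chunk size are illustrative and not taken from the slides.

```python
import hashlib
from typing import Dict, List


class ChunkStore:
    """Hypothetical content-addressed store with fixed-size chunks."""

    def __init__(self, chunk_size: int = 4) -> None:
        self.chunk_size = chunk_size
        self.chunks: Dict[str, bytes] = {}     # hash -> chunk data
        self.files: Dict[str, List[str]] = {}  # file name -> ordered chunk hashes

    def write(self, name: str, data: bytes) -> None:
        hashes = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if this content has not been seen before.
            self.chunks.setdefault(digest, chunk)
            hashes.append(digest)
        self.files[name] = hashes

    def read(self, name: str) -> bytes:
        return b"".join(self.chunks[h] for h in self.files[name])


store = ChunkStore()
store.write("v1", b"AAAABBBBCCCC")
store.write("v2", b"AAAABBBBDDDD")  # shares its first two chunks with v1

print(len(store.chunks))  # 4 unique chunks stored instead of 6
```

Two versions that differ only in their last chunk share the rest, which is the kind of saving partially similar data allows.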
Stakeholders
• The final clients
  – Financing agencies ($)
    • DoE
    • NSERC
  – Science teams
• Development team
  – Graduate students (6+)
  – Undergraduate students, visitors (10+)
• Me
Stakeholders – and their goals
• The final clients
  – Financing agencies ($)
    • DoE
    • NSERC
  – Science teams
• Development team
  – Graduate students (6+)
  – Undergraduate students, visitors (10+)
• Me
Requirements
1. Easy to deploy
2. Easy to integrate with applications
3. Versatility and ability to configure
4. Efficiency / high-performance / scalability
5. Ability to support versioning and partially similar data.
All have big architectural implications
Early architectural decisions
1.) Object-based storage - system structure
2.) Network/protocol stack: uniform
- Stateless to the degree possible
Application
Chunk_4 info
Chunk_3 info
Chunk_2 info
Chunk_1 info
System Access Interface - 1
Donor node - 1
Ext-3 file system
Donor node - 1
Ext-3 file system
Manager
Root
/project/file_1
Control messages
Data messages
Metadata messages
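A minimal sketch of the structure in this diagram, assuming the manager keeps only metadata (which chunks make up a file and which donor node holds each one) while donor nodes hold the chunk data. The class and method names, the round-robin placement, and the 4-byte chunk size are hypothetical; the slide shows the components but not their interfaces.

```python
from typing import Dict, List, Tuple


class DonorNode:
    """Stores chunk data; stands in for a node backed by a local file system."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.chunks: Dict[str, bytes] = {}

    def put(self, chunk_id: str, data: bytes) -> None:
        self.chunks[chunk_id] = data

    def get(self, chunk_id: str) -> bytes:
        return self.chunks[chunk_id]


class Manager:
    """Holds metadata only: file path -> ordered (chunk id, donor name) pairs."""

    def __init__(self) -> None:
        self.chunk_map: Dict[str, List[Tuple[str, str]]] = {}


class SystemAccessInterface:
    """Client-side component that splits files into chunks and reassembles them."""

    def __init__(self, manager: Manager, donors: List[DonorNode], chunk_size: int = 4) -> None:
        self.manager = manager
        self.donors = {d.name: d for d in donors}
        self.order = [d.name for d in donors]
        self.chunk_size = chunk_size

    def write(self, path: str, data: bytes) -> None:
        placement = []
        for index, offset in enumerate(range(0, len(data), self.chunk_size)):
            donor_name = self.order[index % len(self.order)]  # round-robin placement
            chunk_id = f"{path}#chunk_{index + 1}"
            # Data message: the chunk goes straight to a donor node.
            self.donors[donor_name].put(chunk_id, data[offset:offset + self.chunk_size])
            placement.append((chunk_id, donor_name))
        # Metadata message: the manager records only where the chunks are.
        self.manager.chunk_map[path] = placement

    def read(self, path: str) -> bytes:
        placement = self.manager.chunk_map[path]
        return b"".join(self.donors[name].get(cid) for cid, name in placement)


manager = Manager()
donors = [DonorNode("donor-1"), DonorNode("donor-2")]
sai = SystemAccessInterface(manager, donors)

payload = b"0123456789abcdef"
sai.write("/project/file_1", payload)
restored = sai.read("/project/file_1")
print(len(manager.chunk_map["/project/file_1"]), restored == payload)  # 4 True
```

Keeping file data off the manager is what lets data messages flow directly between the client and donor nodes.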
Early architectural decisions
3.) FUSE-based implementation - Impact: structure, deployability
4.) Policy to manage tension between code maturity and need to experiment
Mid-way architectural decisions
5.) GeneralIO hack
6.) Test-driven design
- integrate 3-month projects
Implicit architectural policies
7.) Personnel management:
- Prioritize ‘fun’
- Flat team structure
- Bottom-up decision making / prioritization:
  - ‘campaigns’
8.) Align ‘values’
Key architectural decisions
1.) Object-based storage
2.) Uniform protocol stack
3.) POSIX, FUSE-based implementation
4.) Policy to manage tension between code maturity and need to experiment
5.) GeneralIO hack
6.) Test-driven design
7.) Personnel management: prioritize ‘fun’
8.) Align values