Post on 26-Dec-2015
Middleware Enabled Data Sharing on Cloud Storage Services
Jianzong Wang (Rice University, USA), Peter Varman (Rice University, USA), Changsheng Xie (HUST, China)
Presentation by Qi Huang (HUST, Cornell)
Cloud Storage Overview
• Large deployment -> Global accessibility
• Cost efficient
• Featured functions (Protection/Backup)
[Figure: storage service hosted in the cloud; labels: Internet, Cloud Storage]
• Easy access
• Low cost
• Backup protection
Wish List for a Storage System
• Provides large capacity
• Sustains long-term usage
• Achieves high throughput
• Provides sharing capability
More…
Amazon: Existing Solutions
• Simple Storage Service (S3): large capacity, high availability, sharable, but performance is variable
• Elastic Block Store (EBS): moderate capacity, high performance with small variation, but not sharable
Solution
• A middleware layer
  – Provides a scalable storage service to clients
  – Allows storage volumes to be shared among multiple clients running simultaneously on multiple VMs
  – Provides high performance with less performance variation
Basic Ideas
• mCloud
  – Combines S3, EBS and ELD to serve storage
  – Enables sharing of multiple EBS volumes among multiple EC2 VMs
  – Supports fast and transparent data migration between S3 and EBS
  – Incorporates other strategies to improve performance
    • Layered cache
    • Data chunking
    • Fair IO scheduling
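The fair IO scheduling idea listed above can be illustrated with a minimal round-robin sketch. This is not mCloud's scheduler, which the slides do not describe; `RoundRobinScheduler`, the client names, and the request labels are all hypothetical. It only shows the general principle that a client with a deep queue should not starve a client with a single request.

```python
from collections import OrderedDict, deque


class RoundRobinScheduler:
    """Serve one pending request per client in turn (hypothetical sketch)."""

    def __init__(self):
        # client id -> FIFO queue of pending requests; dict order is the rotation
        self._queues = OrderedDict()

    def submit(self, client, request):
        self._queues.setdefault(client, deque()).append(request)

    def dispatch(self):
        """Return the next (client, request) pair, or None if nothing is pending."""
        if not self._queues:
            return None
        client, queue = next(iter(self._queues.items()))
        request = queue.popleft()
        del self._queues[client]
        if queue:
            # The client still has work: move it to the back of the rotation.
            self._queues[client] = queue
        return client, request


sched = RoundRobinScheduler()
for i in range(3):
    sched.submit("vm-a", f"a{i}")   # vm-a queues three requests first
sched.submit("vm-b", "b0")          # vm-b arrives later with one request

order = []
while (item := sched.dispatch()) is not None:
    order.append(item[1])
print(order)
```

Here the single request from `vm-b` is served right after the first request from `vm-a`, rather than waiting behind all three.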
Contributions
• A fresh way to identify and address performance problems in cloud storage.
• Various topological structures for data sharing on clouds have been investigated in mCloud using data-intensive applications and benchmarks.
• We show potential schemes, such as data placement, data chunking, and IO scheduling strategies, that can be integrated into mCloud to provide performance SLAs for cloud storage services.
Talk Overview
• System Architecture
• Data Sharing Approaches
• Evaluations
• Conclusion and Future Work
A Simple Sharing Method
Limitations:
1. Data transfer to and from S3 is slow
2. Consumption of EBS grows even further when sharing multiple volumes
Data Sharing Approach
Improvements:
1. Sharing happens at ELD and EBS, which performs better
2. Consumption of EBS grows only with more storage
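One way to picture why serving shared data from ELD and EBS helps is a read-through lookup over tiers ordered from fastest to slowest. The sketch below is illustrative only and is not taken from mCloud: `TieredStore` is a hypothetical name, plain dicts stand in for the three storage services, and the promote-on-read policy is an assumption.

```python
class TieredStore:
    """Read-through lookup over tiers ordered fastest to slowest (hypothetical)."""

    def __init__(self, tiers):
        # tiers: list of (name, dict) pairs; the dicts stand in for real storage
        self.tiers = tiers

    def read(self, key):
        for level, (name, store) in enumerate(self.tiers):
            if key in store:
                value = store[key]
                # Assumed policy: copy the value into every faster tier
                # so that later reads are served without reaching this tier.
                for _, faster in self.tiers[:level]:
                    faster[key] = value
                return value, name
        raise KeyError(key)


eld, ebs, s3 = {}, {}, {"chunk-0": b"data"}
store = TieredStore([("ELD", eld), ("EBS", ebs), ("S3", s3)])

first = store.read("chunk-0")    # found only in the slowest tier
second = store.read("chunk-0")   # now served from the fastest tier
print(first[1], second[1])
```

Only the first read reaches the slowest tier; after promotion, repeated reads stay within the faster tiers.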
Evaluations: Basic Performance Testbed Configuration
Takeaways:
1. Performance is stable up to the EBS level (within EC2)
2. Outside EC2, throughput becomes unstable and poor
Evaluations: Scaling number of EBS/EC2 for application file settings
When scaling, writes perform better than reads
Conclusions and Discussion
• Hybrid Cloud Storage Architecture
  – How to organize the optimized architecture to provide better storage services, and how to introduce a DHT (Distributed Hash Table) to maintain the metadata.
• IO Scheduling
  – The switcher can control IO to balance the system load and avoid performance bursts.
• Optimizing the Cloud Storage Medium
  – A key-value design may not be the best one. It is possible to devise new ones.
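The DHT idea mentioned under Hybrid Cloud Storage Architecture can be sketched with consistent hashing, a common building block of DHTs. The slides do not say which DHT design is intended, so this is an assumption; `HashRing`, the node names `meta-0` to `meta-3`, and the key format are hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring with virtual nodes (hypothetical sketch)."""

    def __init__(self, nodes, replicas=64):
        # Each node is placed at `replicas` positions to even out the load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # The owner is the first ring position clockwise from the key's hash.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["meta-0", "meta-1", "meta-2"])
owner = ring.node_for("volume-7/chunk-42")
print(owner)
```

The property that makes this attractive for metadata is that adding a node moves only the keys the new node takes over; every other key keeps its owner.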