

Middleware Enabled Data Sharing on Cloud Storage Services

Jianzong Wang (Rice University, USA)
Peter Varman (Rice University, USA)
Changsheng Xie (HUST, China)

Presentation by Qi Huang (HUST, Cornell)

Cloud Storage Overview

• Large deployment -> Global accessibility
• Cost efficient
• Featured functions (Protection/Backup)

Storage service hosted in Cloud

[Diagram labels: Internet, Cloud Storage]

• Easy access
• Low cost
• Back-up protection

Wish List for a Storage System

• Provides large capacity
• Sustains long-term usage
• Achieves high throughput
• Provides sharing capability
• More…

Amazon: Existing Solutions

• Simple Storage Service (S3): large capacity, high availability, and sharable, but performance is variable

• Elastic Block Store (EBS): moderate capacity and high performance with small variation, but not sharable

Solution

• A middleware layer
  – Provides a scalable storage service to clients
  – Allows storage volumes to be shared among multiple clients running simultaneously on multiple VMs
  – Provides high performance with less performance variation

Basic Ideas

• mCloud
  – Combines S3, EBS, and ELD to serve storage
  – Enables sharing of multiple EBS volumes among multiple EC2 VMs
  – Supports fast and transparent data migration between S3 and EBS
  – Incorporates other strategies to improve performance
    • Layered cache
    • Data chunking
    • Fair IO scheduling
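The layered cache idea can be illustrated with a small read-through sketch. This is not mCloud's implementation: the tier order (ELD fastest, then EBS, with S3 as the backing store), the LRU eviction policy, and the names `Tier` and `LayeredStore` are all assumptions made for illustration, and a plain dict stands in for S3.

```python
from collections import OrderedDict


class Tier:
    """One storage tier with a fixed block capacity and LRU eviction (assumed policy)."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.blocks = OrderedDict()

    def get(self, key):
        if key not in self.blocks:
            return None
        self.blocks.move_to_end(key)  # mark as most recently used
        return self.blocks[key]

    def put(self, key, value):
        self.blocks[key] = value
        self.blocks.move_to_end(key)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block


class LayeredStore:
    """Read-through lookup: fastest tier first, backing store last."""

    def __init__(self, tiers, backing):
        self.tiers = tiers      # ordered fastest to slowest (hypothetical: ELD, EBS)
        self.backing = backing  # dict standing in for S3

    def read(self, key):
        """Return (value, name of the level that served the read)."""
        for i, tier in enumerate(self.tiers):
            value = tier.get(key)
            if value is not None:
                for faster in self.tiers[:i]:
                    faster.put(key, value)  # promote into the faster tiers
                return value, tier.name
        value = self.backing[key]
        for tier in self.tiers:
            tier.put(key, value)  # populate every cache tier on a miss
        return value, "S3"
```

A first read of a key is served from the backing store and later reads from the fastest tier that still holds it, so repeated accesses avoid the slow path.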

Contributions

• A fresh way to identify and address performance problems in cloud storage.

• Various topological structures for data sharing on clouds have been investigated in mCloud using data-intensive applications and benchmarks.

• We show potential schemes, for instance data placement, data chunking, and IO scheduling strategies, that can be integrated into mCloud to provide performance SLAs for cloud storage services.

Talk Overview

• System Architecture

• Data Sharing Approaches

• Evaluations

• Conclusion and Future Work


System Architecture


A Simple Sharing Method

Limitations:
1. Data transfer to and from S3 is slow.
2. Consumption of EBS grows even further when sharing multiple volumes.

Data Sharing Approach

Improvements:
1. Sharing happens at the ELD and EBS levels, which performs better.
2. Consumption of EBS grows only when more storage is needed.

Evaluations: Basic Performance Testbed Configuration

Takeaways:
1. Performance is stable up to the EBS level (within EC2).
2. Outside EC2, throughput becomes unstable and poor.

Evaluations: Scaling number of EBS/EC2 for single size file

Total throughput scales with the number of EBS volumes and EC2 instances.


Evaluations: Scaling number of EBS/EC2 for application file settings

Writes scale better than reads.

Evaluations: Chunking Performance

Average throughput increases with the number of chunks.
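One plausible reading of why more chunks help is that chunks can be transferred concurrently. The sketch below shows that pattern only; the slides do not describe mCloud's chunking scheme, so the fixed chunk size, the `name.index` key layout, the worker count, and the function names are hypothetical, and a dict stands in for the storage backend.

```python
from concurrent.futures import ThreadPoolExecutor


def split_chunks(data, chunk_size):
    """Split bytes into consecutive chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def put_chunked(store, name, data, chunk_size, workers=4):
    """Write each chunk under its own key in parallel; return the chunk count."""
    chunks = split_chunks(data, chunk_size)

    def put_one(indexed):
        index, chunk = indexed
        store[f"{name}.{index}"] = chunk  # stand-in for one upload request

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(put_one, enumerate(chunks)))  # wait for all writes
    return len(chunks)


def get_chunked(store, name, count, workers=4):
    """Fetch all chunks in parallel and reassemble them in index order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda i: store[f"{name}.{i}"], range(count))
        return b"".join(parts)  # pool.map yields results in input order
```

For example, `put_chunked(store, "f", data, 4)` followed by `get_chunked(store, "f", n)` with the returned count `n` round-trips the original bytes.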

Conclusions and Discussion

• Hybrid Cloud Storage Architecture
  – How to combine the optimizations into an architecture that provides better storage services, and how to introduce a DHT (Distributed Hash Table) to maintain the metadata.

• IO Scheduling
  – The switcher can control IO to balance the system load and avoid performance bursts.

• Optimizing the Cloud Storage Medium
  – The key-value design may not be the best one. It is possible to propose new ones.
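As a rough illustration of how a switcher could keep one busy client from monopolizing IO, the sketch below serves per-client queues in round-robin order. The slides do not specify the scheduling policy, so round-robin is a substituted simple technique here, and `RoundRobinScheduler` and its methods are hypothetical names.

```python
from collections import OrderedDict, deque


class RoundRobinScheduler:
    """Serve one pending request per client in turn (assumed fairness policy)."""

    def __init__(self):
        self.queues = OrderedDict()  # client id -> deque of pending requests

    def submit(self, client, request):
        self.queues.setdefault(client, deque()).append(request)

    def next_request(self):
        """Return (client, request) for the client whose turn it is, or None."""
        if not self.queues:
            return None
        client, queue = next(iter(self.queues.items()))
        request = queue.popleft()
        if queue:
            self.queues.move_to_end(client)  # go to the back of the rotation
        else:
            del self.queues[client]  # idle clients leave the rotation
        return client, request
```

If client A submits three requests before client B submits one, B's request is served second rather than last, which smooths out bursts from a single client.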

Thank You

Please feel free to email:

[email protected]
