1
NCCloud: A Network-Coding-Based Storage System in a Cloud-of-Clouds
Henry C. H. Chen
Yuchong Hu
Patrick P. C. Lee
Yang Tang
IEEE Transactions on Computers, 15 August 2013
2
Outline
• Introduction
• Repair in Multiple Cloud Storage
• FMSR Codes
• NCCloud
• Conclusion
3
Introduction
• Cloud storage provides an on-demand remote backup solution.
• Relying on a single cloud storage provider creates problems such as a single point of failure.
4
Introduction
• The general solution is to distribute data across different cloud providers.
  - stripe data
• Fault tolerance can be improved by the diversity of multiple clouds.
5
Introduction-Data Failure
• This paper focuses on unexpected permanent cloud failures.
  - a cloud fails permanently => activate repair.
  - maintain data redundancy and fault tolerance.
• A repair operation
  - retrieves data from the surviving clouds.
  - reconstructs the lost data in a new cloud.
6
Introduction-Data Failure
• During repair, each surviving node
  - encodes its stored data chunks.
  - sends the encoded chunks to a new node.
• The new node regenerates the lost data.
7
Introduction-Cost Problem
• Today’s cloud storage providers charge users for outbound data.
• When repairing failures, moving an enormous amount of data (repair traffic) can incur significant monetary costs.
8
Introduction-Repair Traffic Problem
• To minimize repair traffic, regenerating codes [16] have been proposed.
  - store data redundantly in a distributed storage system.
  - require less repair traffic, but with the same fault-tolerance level.
[16] Network Coding for Distributed Storage Systems
9
Introduction-Regenerating Codes
• However, most existing regenerating codes require storage nodes
  - to be equipped with computation capabilities.
  - to perform encoding operations during repair.
10
Introduction-Regenerating Codes
• To make regenerating codes portable to any cloud storage service, this paper assumes only a thin-cloud interface, where storage nodes need only support read/write operations.
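A thin-cloud interface can be pictured as an object store that offers nothing beyond reading and writing whole objects. The sketch below is only illustrative: the class names `ThinCloud` and `InMemoryCloud` and the method signatures are hypothetical and are not taken from NCCloud.

```python
from abc import ABC, abstractmethod


class ThinCloud(ABC):
    """Hypothetical thin-cloud interface: read and write only, no server-side computation."""

    @abstractmethod
    def write(self, key: str, data: bytes) -> None:
        ...

    @abstractmethod
    def read(self, key: str) -> bytes:
        ...


class InMemoryCloud(ThinCloud):
    """Stand-in for a real provider, used here only to exercise the interface."""

    def __init__(self):
        self._objects = {}

    def write(self, key, data):
        self._objects[key] = bytes(data)

    def read(self, key):
        return self._objects[key]


cloud = InMemoryCloud()
cloud.write("chunk-0", b"\x01\x02\x03")
```

Because the interface has no encode operation, any coding work during repair has to happen outside the storage nodes.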
11
Introduction-NCCloud
• In this paper, we present the design and implementation of NCCloud
  - a proxy-based storage system.
  - a fault-tolerant storage.
  - over multiple cloud storage providers.
12
Introduction-FMSR
• NCCloud is built on top of the functional minimum-storage regenerating (FMSR) codes.
• The FMSR code implementation
  - maintains double-fault tolerance.
  - maintains the same storage cost as RAID-6.
  - uses less repair traffic when recovering from a single-cloud failure.
13
Introduction-FMSR
• FMSR codes are non-systematic
  - the encoded chunks are formed by linear combinations of the original data chunks.
  - the original data chunks are not kept, unlike in systematic coding schemes.
14
Outline
• Introduction
• Repair in Multiple Cloud Storage
• FMSR Codes
• NCCloud
• Conclusion
15
Repair in Multiple Cloud Storage
• Transient failure
  - is short-term: the failed cloud returns to normal after some time and no outsourced data is lost.
16
Repair in Multiple Cloud Storage
• Permanent failure
  - is long-term: the outsourced data on a failed cloud becomes permanently unavailable.
• Examples:
  - data center outages in disasters
  - data loss and corruption
  - malicious attacks
17
Outline
• Introduction
• Repair in Multiple Cloud Storage
• FMSR Codes
  - Motivation
  - Implementation
• NCCloud
• Conclusion
18
Motivation
• This paper considers
  - distributed
  - multiple-cloud storage
  - data is striped
  - proxy-based design
19
Motivation
20
Fault-tolerant
• Maximum Distance Separable (MDS) property
  - (n, k)-MDS code
    - divide the file into equal-size native chunks
    - linearly combine them to form code chunks
  - distribute the chunks over n (larger than k) nodes.
  - reconstruct the original file from any k of the n nodes.
  - tolerate the failure of any n − k nodes.
21
Fault-tolerant
• FMSR codes can reconstruct the data of a failed node from the surviving nodes.
  - download less data.
  - do not reconstruct the whole file.
22
Different Coding Schemes
• Storage size 2M, repair traffic M
• Storage size 2M, repair traffic 0.75M
• Storage size 2M, repair traffic 0.75M
23
Double-fault Tolerant FMSR Codes
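• divide a file of size M into 2(n − 2) native chunks.
• generate 2n code chunks.
• each node stores two code chunks of size M/(2(n − 2)) each.
• to repair a failed node, the repair traffic is (n − 1)M/(2(n − 2)).
• with RAID-6 codes, the total storage size is nM/(n − 2) and the repair traffic is M.
• up to 50% of the repair traffic is saved (approached as n grows large).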
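The sizes follow from the chunk count: a file of size M split into 2(n − 2) native chunks gives chunks of size M/(2(n − 2)); 2n code chunks of that size are stored, and a repair downloads one chunk from each of the n − 1 surviving nodes. The following sketch evaluates these expressions with exact fractions. The function names are illustrative, and the RAID-6 figures assume a conventional repair that downloads data equal to the whole file.

```python
from fractions import Fraction


def fmsr_costs(n, file_size=Fraction(1)):
    """Total storage and single-node repair traffic for double-fault-tolerant FMSR (k = n - 2)."""
    chunk = file_size / (2 * (n - 2))   # size of each native/code chunk
    storage = 2 * n * chunk             # 2n code chunks in total
    repair = (n - 1) * chunk            # one chunk from each of the n - 1 survivors
    return storage, repair


def raid6_costs(n, file_size=Fraction(1)):
    """Total storage and repair traffic for RAID-6 with conventional repair."""
    storage = file_size * n / (n - 2)
    repair = file_size                  # download data equal to the whole file
    return storage, repair


for n in (4, 10, 100):
    fmsr_storage, fmsr_repair = fmsr_costs(n)
    _, raid6_repair = raid6_costs(n)
    saving = 1 - fmsr_repair / raid6_repair
    print(f"n={n}: storage {float(fmsr_storage):.3f}M, "
          f"FMSR repair {float(fmsr_repair):.4f}M, saving {float(saving):.1%}")
```

For n = 4 this gives storage 2M and repair traffic 0.75M, matching the figures on the coding-scheme comparison slide; the saving grows toward 50% as n increases.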
24
Outline
• Introduction
• Repair in Multiple Cloud Storage
• FMSR Codes
  - Motivation
  - Implementation
• NCCloud
• Conclusion
25
FMSR Codes Implementation
• FMSR codes do not require lost chunks to be exactly reconstructed
  - the new chunks need not be identical to those on the failed node.
• This is acceptable as long as the MDS property holds.
26
FMSR Codes Implementation
• This paper proposes a two-phase checking scheme to ensure that the code chunks on all nodes always satisfy the MDS property.
27
FMSR Codes Implementation
• The implementation assumes a thin-cloud interface.
1. File upload
2. File download
3. Repair
28
File Upload
• Native chunks: k(n − k) chunks
• Code chunks: n(n − k) chunks
• Encoding matrix of coefficients:
  - size n(n − k) × k(n − k)
  - entries in the Galois field GF(p^n)
29
File Upload
• Galois field GF(p^n)
Encoding coefficient vector (ECV)
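As a rough illustration of the upload step, the sketch below forms each code chunk as a linear combination of the native chunks, with one ECV per code chunk. Several choices are assumptions made for the example: the field is GF(2^8) with reduction polynomial 0x11D, the coefficients are drawn at random without checking the MDS property, and the names `gf_mul` and `encode` are illustrative.

```python
import random


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by x^8 + x^4 + x^3 + x^2 + 1 (0x11D, assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def encode(native_chunks, encoding_matrix):
    """Form each code chunk as a linear combination of the native chunks.

    Row i of encoding_matrix is the ECV of code chunk i.
    """
    size = len(native_chunks[0])
    code_chunks = []
    for ecv in encoding_matrix:
        chunk = bytearray(size)
        for coeff, native in zip(ecv, native_chunks):
            for pos in range(size):
                chunk[pos] ^= gf_mul(coeff, native[pos])  # addition in GF(2^8) is XOR
        code_chunks.append(bytes(chunk))
    return code_chunks


# n = 4, k = 2: k(n - k) = 4 native chunks and n(n - k) = 8 code chunks.
n, k = 4, 2
rng = random.Random(0)
native = [b"AB", b"CD", b"EF", b"GH"]
em = [[rng.randrange(256) for _ in range(k * (n - k))] for _ in range(n * (n - k))]
code = encode(native, em)
# Each node stores n - k consecutive code chunks.
stored = {node: code[node * (n - k):(node + 1) * (n - k)] for node in range(n)}
```

A real upload would also keep the encoding matrix as metadata, since the ECVs are needed for download and repair.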
30
File Download
1. Download the k(n − k) code chunks from any k of the n storage nodes.
2. The ECVs of the k(n − k) code chunks form a k(n − k) × k(n − k) square matrix.
3. Obtain the original k(n − k) native chunks.
  - multiply the inverse of the square matrix with the code chunks.
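The download steps amount to solving a linear system over the Galois field. The sketch below does this by Gauss-Jordan elimination on the augmented matrix [ECVs | chunk data], which yields the same result as multiplying the inverse of the ECV matrix with the code chunks. The field GF(2^8) with polynomial 0x11D, the Vandermonde-style example ECVs, and all function names are assumptions made for illustration.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by 0x11D (assumed polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse as a^254, since a^255 = 1 for every nonzero a."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def combine(ecv, chunks):
    """Linear combination of chunks with the coefficients in ecv."""
    out = bytearray(len(chunks[0]))
    for coeff, chunk in zip(ecv, chunks):
        for pos, byte in enumerate(chunk):
            out[pos] ^= gf_mul(coeff, byte)
    return bytes(out)


def decode(ecvs, code_chunks):
    """Recover the native chunks from a square ECV matrix and its code chunks."""
    m = len(ecvs)
    rows = [list(ecv) + list(chunk) for ecv, chunk in zip(ecvs, code_chunks)]
    for col in range(m):
        pivot = next((r for r in range(col, m) if rows[r][col]), None)
        if pivot is None:
            raise ValueError("ECV matrix is singular (MDS property violated)")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = gf_inv(rows[col][col])
        rows[col] = [gf_mul(inv, v) for v in rows[col]]
        for r in range(m):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [v ^ gf_mul(factor, p) for v, p in zip(rows[r], rows[col])]
    return [bytes(row[m:]) for row in rows]


# Toy case with k(n - k) = 4: rows are [1, a, a^2, a^3] in GF(2^8) for a = 1..4.
native = [b"AB", b"CD", b"EF", b"GH"]
ecvs = [[1, 1, 1, 1], [1, 2, 4, 8], [1, 3, 5, 15], [1, 4, 16, 64]]
code = [combine(ecv, native) for ecv in ecvs]
recovered = decode(ecvs, code)
```

Here `recovered` equals `native`, because a Vandermonde matrix with distinct nonzero generators is invertible over a field.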
31
Iterative Repair
• The MDS property must hold even after iterative repairs.
• This paper proposes two-phase checking.
  - MDS property
  - rMDS (repair MDS) property
32
Satisfy MDS, but not rMDS
33
Iterative Repair
Step 1. Download the encoding matrix from a surviving node.
Step 2. Select one ECV from each of the n-1 surviving nodes.
Step 3. Generate a repair matrix.
Step 4. Compute the ECVs for the new code chunks and produce a new encoding matrix.
34
Iterative Repair
Step 5. Given the new encoding matrix EM’, verify that both properties are satisfied.
  - verify MDS by enumerating all choices of k out of the n nodes.
  - verify rMDS over n(n − k)^(n − 1) cases.
  - the corresponding encoding matrices must have full rank.
Step 6. Download the actual chunk data and regenerate the new chunk data from
  - the new ECVs (Step 4)
  - the code chunks from the surviving nodes
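Steps 2 to 5 can be sketched on ECVs alone, before any chunk data is transferred. The example below is a simplified illustration, not the authors' implementation: it draws the repair matrix coefficients at random (an assumption), performs only the MDS check and omits the rMDS check, and uses GF(2^8) with polynomial 0x11D. The layout `em[node]` (a list of the n − k ECVs on each node) and all function names are hypothetical.

```python
import random
from itertools import combinations


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by 0x11D (assumed polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse as a^254."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rank(matrix):
    """Rank over GF(2^8) by Gaussian elimination."""
    rows = [list(r) for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        inv = gf_inv(rows[rk][col])
        rows[rk] = [gf_mul(inv, v) for v in rows[rk]]
        for r in range(len(rows)):
            if r != rk and rows[r][col]:
                f = rows[r][col]
                rows[r] = [v ^ gf_mul(f, p) for v, p in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def is_mds(em, n, k):
    """The ECVs of any k nodes must form a full-rank k(n-k) x k(n-k) matrix."""
    return all(
        rank([ecv for node in subset for ecv in em[node]]) == k * (n - k)
        for subset in combinations(range(n), k)
    )


def repair_ecvs(em, failed, n, k, rng):
    """Retry random selections until the new encoding matrix passes the MDS check."""
    survivors = [i for i in range(n) if i != failed]
    while True:
        picked = [rng.choice(em[i]) for i in survivors]                       # Step 2
        rm = [[rng.randrange(256) for _ in survivors] for _ in range(n - k)]  # Step 3
        new_ecvs = []
        for row in rm:                                                        # Step 4
            ecv = [0] * (k * (n - k))
            for coeff, src in zip(row, picked):
                ecv = [e ^ gf_mul(coeff, s) for e, s in zip(ecv, src)]
            new_ecvs.append(ecv)
        candidate = em[:failed] + [new_ecvs] + em[failed + 1:]
        if is_mds(candidate, n, k):                                           # Step 5 (MDS only)
            return candidate


n, k = 4, 2
rng = random.Random(0)
em = None
while em is None or not is_mds(em, n, k):  # start from a random matrix that is MDS
    em = [[[rng.randrange(256) for _ in range(k * (n - k))]
           for _ in range(n - k)] for _ in range(n)]
em = repair_ecvs(em, failed=0, n=n, k=k, rng=rng)
```

Step 6 would then apply the same repair matrix to the chunk data downloaded from the selected survivors, so the checks cost no data transfer when a candidate is rejected.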
35
rMDS Sustaining
36
Time of Two-phase Checking
37
Double-fault Tolerant Codes
• Markov Model
38
MTTDL, Compared to RAID-6
Mean Time To Data Loss
39
Outline
• Introduction
• Repair in Multiple Cloud Storage
• FMSR Codes
• NCCloud
• Conclusion
40
NCCloud
• A proxy that bridges user applications and multiple clouds.
• Its design is built on three layers.
  - File system layer
  - Coding layer
  - Storage layer
41
NCCloud
• It is mainly implemented in Python, while the coding schemes are implemented in C for better efficiency.
42
Goal of NCCloud
• Compare the costs and response times of RAID-6 and FMSR codes.
• Show the cost advantage of FMSR over RAID-6 while maintaining acceptable response time.
43
Goal of NCCloud
• Normal operations
  - RAID-6 and FMSR incur similar storage costs.
• Repair operation
  - FMSR saves a significant amount of transfer costs over RAID-6.
44
Cost Saving-Price
45
Cost Saving
• Normal operations
  - 1.25PB of data stored
    - FMSR: $86,851 monthly storage cost
    - RAID-6: $86,851 monthly storage cost
• Repair operation
  - RAID-6: 1PB of data, $56,832
  - FMSR: 0.5625PB of data, $33,894
  - Saving of $22,938
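The data volumes on this slide are consistent with a 1PB file stored over n = 10 clouds with double-fault tolerance; that value of n is inferred from the numbers rather than stated on the slide. The short check below reproduces the volumes and the dollar saving; the dollar amounts are copied from the slide, not computed from a price table.

```python
from fractions import Fraction

n = 10                    # assumed: inferred from 1.25PB stored per 1PB of data
file_pb = Fraction(1)

stored_pb = file_pb * n / (n - 2)                    # total storage, both schemes
raid6_repair_pb = file_pb                            # RAID-6 downloads the whole file
fmsr_repair_pb = file_pb * (n - 1) / (2 * (n - 2))   # one chunk from each survivor

raid6_repair_cost = 56_832   # dollars, from the slide
fmsr_repair_cost = 33_894    # dollars, from the slide
saving = raid6_repair_cost - fmsr_repair_cost

print(float(stored_pb), float(fmsr_repair_pb), saving)
print(f"{saving / raid6_repair_cost:.1%} of the RAID-6 repair cost")
```

The repair traffic falls by 43.75%, while the quoted cost falls by roughly 40%.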
46
Response Time-Local Cloud
47
Response Time-Local Cloud
48
Response Time-Commercial Cloud
49
Outline
• Introduction
• Repair in Multiple Cloud Storage
• FMSR Codes
• NCCloud
• Conclusion
50
Conclusion
• This paper presents NCCloud, which improves the reliability of today’s cloud backup storage.
  - proxy-based
  - multiple-cloud storage system
• NCCloud not only provides fault tolerance in storage, but also allows cost-effective repair.
• The FMSR code implementation eliminates the need for storage nodes to perform encoding during repair.