Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao...

21
Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang , Hao Jiang , Tao Yang*, Xiaogang Li , Yue Zeng * University of California at Santa Barbara Aliyun.com Inc.

Transcript of Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao...

Page 1: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Multi-level Selective Deduplication for VMSnapshots in Cloud Storage

Wei Zhang*, Hong Tang†, Hao Jiang†, Tao Yang*, Xiaogang Li†, Yue Zeng†

* University of California at Santa Barbara† Aliyun.com Inc.

Page 2: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Motivations

• Virtual machines on the cloud use frequent backup to improve service reliability Used in Alibaba’s Aliyun - the largest public cloud

service in China• High storage demand

Daily backup workload: hundreds of TB @ Aliyun Number of VMs per cluster: 10000+

• Large content duplicates• Limited resource for deduplication

No special hardware or dedicated machines Small CPU& memory footprint

Page 3: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Focus and Related Work

• Previous work Version-based incremental snapshot backup

– Inter-block/VM duplicates are not detected. Chunk-based file duduplication

– High cost for chunk lookup• Focus on

Parallel backup of a large number of virtual disks. – Large files for VM disk images.

Contributions– Cost-constrained solution with very limited computing

resource– Multi-level selective duplicate detection and parallel backup.

Page 4: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Requirements

• Negligible impact on existing cloud service and VM performance

Must minimize CPU and IO bandwidth consumption for backup and deduplication workload

– (e.g. <1% of total resource).

• Fast backup speed Compute backup for 10,000+ users within a few hours

each day during light cloud workload.• Fault tolerance constraint addition of data deduplication should not decrease the

degree of fault tolerance.

Page 5: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Design Considerations

• Design alternatives An external and dedicated backup storage

system.

A decentralized and co-hosted backup system with full deduplication

BackupCloud service

backup

Cloud service

backup

Cloud service. . .

backup

Cloud service

Page 6: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Design Considerations

• Decentralized architecture running on a general purpose cluster• co-hosting both elastic computing and backup

service• Multi-level deduplication Localize backup traffic and exploit data parallelism Increase fault tolerance• Selective deduplication Use minimal resource while still removing most of

redundant content and accomplishing good efficiency

Page 7: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Key Observations

Inner-VM data characteristicsExploit unchanged data to localize

deduplicationCross-VM data characteristics

Small common data dominates duplicates

Zipf-like distribution of VM OS/user dataSeparate consideration of OS and user

data

Page 8: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

VM Snapshot Representation

Data blocks are variable-sized

Segments are fix-sized

Page 9: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Processing Flow of Multi-level Deduplication

Page 10: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Data Processing Steps

• Segment level checkup. Use dirty bitmap to see which segments are modified.

• Block level checkup Divide a segment into variable-sized blocks,

and compare their signatures with the parent snapshot

• Checkup from common dataset (CDS) Identify duplicate chunks from CDS

• Write new snapshot blocks Write new content chunks to stoage.

• Save recipes Save segment meta-data information

Page 11: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Architecture of Multi-level VM snapshot backup

Cluster node

Page 12: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Status& Evaluation

• Prototype system running on Alibaba’s Aliyuan cloud. Based on Xen. 100 nodes and each has 16 cores, 48G memory,

25VMs. Use <150MB per machine for backup&deduplication

• Evaluation data from Aliyuan’s production cluster 41TB. 10 snapshots per VM Segment size: 2MB. Avg. Block size: 4KB

Page 13: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Data Characteristics of the Benchmark

• Each VM uses 40GB storage space on average

• OS and user data disks: each takes ~50% of space

• OS data 7 main stream OS releases: Debian, Ubuntu, Redhat, CentOS, Win2003

32bit, win2003 64 bit and win2008 64 bit.• User data

From 1323 VM users

Page 14: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Impacts of 3-Level Deduplication

Level 1: Segment-level detection within VMLevel 2: Block-level detection within VMLevel 3: Common data block detection across-VM

Page 15: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Impact for Different OS Releases

Page 16: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Separate consideration of OS and user data

Both have Zipf-like data distributionBut popularity growth differs as the cluster size/VM users increase

Page 17: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Commonality among OS releases

1G common OS meta datacovers 70+%

Page 18: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Cumulative coverage of popular user data

Coverage is the summation of covered data block size*frequency

Page 19: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Space saving compared to perfect deduplication as CDS size increases

100G CDS (1GB index) -> 75% of perfect dedup

Page 20: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Impact of dataset-size increase

Page 21: Multi-level Selective Deduplication for VM Snapshots in Cloud Storage Wei Zhang*, Hong Tang †, Hao Jiang †, Tao Yang*, Xiaogang Li †, Yue Zeng † * University.

Conclusions

• Contributions: A multi-level selective deduplication scheme among VM

snapshots– Inner-VM deduplication localizes backup and exposes more

parallelism– global deduplication with a small common data set appeared in

OS and data disks

Use less than 0.5% of memory per node to meet a stringent cloud resource requirement -> accomplish 75% of what perfect deduplication does.

• Experiments Achieve 500TB/hour on a 1000-node cloud cluster Reduce bandwidth by 92% -> 40TB/hour