Data Versioning Systems
![Page 1: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/1.jpg)
Data Versioning Systems
Research Proficiency Exam
Ningning Zhu
Advisor: Tzi-cker Chiueh
Computer Science Department, State University of New York at Stony Brook
Feb 10, 2003
![Page 2: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/2.jpg)
Definitions
Data Object
Granularity of Data Object: file, tuple, database table, database logical volume, database, block device
Version of a Data Object: a consistent state, a snapshot, a point-in-time image
Data Repository
Version Repository
![Page 3: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/3.jpg)
Why do we need data versioning?
Documentation version control, human mistakes, malicious attacks, software failures, history study
![Page 4: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/4.jpg)
Data Versioning Vs. Other Techniques
Backup, mirroring, replication, redundancy, perpetual storage
![Page 5: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/5.jpg)
Design Issues
Resource consumption: storage capacity, CPU, storage bandwidth, network bandwidth
Performance: access to old versions and to the current object; throughput, latency
Maintenance Effort
![Page 6: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/6.jpg)
Design Options
Who performs the versioning? User, application, file system, database system, object store, virtual disks, block device
Where and what to save? Separate version repository? Full image vs. delta
How? Frequency, scope
![Page 7: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/7.jpg)
Data Versioning Techniques
Save
Represent
Extract
![Page 8: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/8.jpg)
Save: naive approach (1)
![Page 9: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/9.jpg)
Save: Split Mirror (2)
![Page 10: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/10.jpg)
Save: copy-old-while-update-new (3)
![Page 11: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/11.jpg)
Save: keep-old-and-create-new (4)
![Page 12: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/12.jpg)
Represent (1)
Full image: easy to extract, consumes more resources
Delta: reference direction, reference object, differencing algorithm
Chain of deltas and full images
![Page 13: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/13.jpg)
Represent: Chain structure (2)
Forward delta: V1, D(1,2), D(2,3), V4, D(4,5), D(5,6), V7
Forward delta with version jumping: V1, D(1,2), D(1,3), V4, D(4,5), D(4,6), V7
Reverse delta: V1, D(3,2), D(4,3), V4, D(6,5), D(7,6), V7
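The three chain layouts can be compared with a small sketch. It assumes a toy data model that is not in the slides: a version is a dictionary, and a delta D(a,b) is a dictionary of the keys that change when going from version a to version b (`None` marks a removed key). The function names and the sample history are hypothetical; the point is only how many deltas must be applied to extract a given version under each layout.

```python
def make_delta(src, dst):
    """Delta D(src, dst): changed or added keys map to new values, removed keys to None."""
    delta = {k: v for k, v in dst.items() if src.get(k) != v}
    delta.update({k: None for k in src if k not in dst})
    return delta


def apply_delta(base, delta):
    """Return a new image obtained by applying delta to base."""
    out = dict(base)
    for key, value in delta.items():
        if value is None:
            out.pop(key, None)
        else:
            out[key] = value
    return out


def extract_forward(fulls, deltas, k):
    """Forward chain: start at the nearest full image at or before k, walk forward."""
    base = max(v for v in fulls if v <= k)
    image, steps = fulls[base], 0
    for v in range(base, k):
        image = apply_delta(image, deltas[(v, v + 1)])
        steps += 1
    return image, steps


def extract_jumping(fulls, deltas, k):
    """Version jumping: every delta is taken against the preceding full image."""
    base = max(v for v in fulls if v <= k)
    if base == k:
        return fulls[k], 0
    return apply_delta(fulls[base], deltas[(base, k)]), 1


def extract_reverse(fulls, deltas, k):
    """Reverse chain: start at the nearest full image at or after k, walk backward."""
    base = min(v for v in fulls if v >= k)
    image, steps = fulls[base], 0
    for v in range(base, k, -1):
        image = apply_delta(image, deltas[(v, v - 1)])
        steps += 1
    return image, steps


# Hypothetical seven-version history with full images kept at V1, V4 and V7.
history = {i: {"title": "report", "body": f"draft {i}"} for i in range(1, 8)}
history[3]["note"] = "temp"  # a key that exists only in version 3
fulls = {v: history[v] for v in (1, 4, 7)}
forward = {(i, i + 1): make_delta(history[i], history[i + 1]) for i in (1, 2, 4, 5)}
jumping = {(b, b + j): make_delta(history[b], history[b + j]) for b in (1, 4) for j in (1, 2)}
reverse = {(i + 1, i): make_delta(history[i + 1], history[i]) for i in (2, 3, 5, 6)}
```

With this layout, extracting V3 takes two delta applications on the forward chain and one with version jumping, at the price of larger deltas the further a version is from its full image.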
![Page 14: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/14.jpg)
Represent: differencing algorithm (3)
Insert/Delete (diff) vs. Insert/Copy (bdiff)
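Rabin fingerprint: given a sequence of bytes $t_1, t_2, t_3, \ldots$, the fingerprint of a window of $\beta$ bytes is

$$RF(t_1, t_2, \ldots, t_\beta) = (t_1 p^{\beta-1} + t_2 p^{\beta-2} + \cdots + t_\beta) \bmod M$$

$$RF(t_{i+1}, \ldots, t_{i+\beta}) = \big((RF(t_i, \ldots, t_{i+\beta-1}) - t_i\, p^{\beta-1})\, p + t_{i+\beta}\big) \bmod M$$

SHA-1: collision-resistant hash function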
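A minimal sketch of a rolling fingerprint of this kind, assuming the usual polynomial form: the fingerprint of a window of bytes is a polynomial in a base p taken modulo M, so sliding the window by one byte only needs the old fingerprint, the byte leaving the window, and the byte entering it. The constants `P` and `M` and the function names below are illustrative assumptions, not values from the slides.

```python
P = 257            # assumed base, slightly larger than the byte alphabet
M = (1 << 31) - 1  # assumed modulus (a Mersenne prime)


def direct_fingerprint(chunk, p=P, m=M):
    """Fingerprint of one window computed from scratch (Horner's rule)."""
    fp = 0
    for byte in chunk:
        fp = (fp * p + byte) % m
    return fp


def rabin_fingerprints(data, window, p=P, m=M):
    """Yield (offset, fingerprint) for every window-byte substring of data.

    After the first window, each fingerprint is derived from the previous
    one in constant time: drop the leading byte, shift by p, add the new byte.
    """
    if len(data) < window:
        return
    top = pow(p, window - 1, m)  # weight of the byte that leaves the window
    fp = direct_fingerprint(data[:window], p, m)
    yield 0, fp
    for i in range(len(data) - window):
        fp = ((fp - data[i] * top) * p + data[i + window]) % m
        yield i + 1, fp
```

A differencing tool can index the fingerprints of one version and probe the other version's windows against that index to find copy candidates; matches should still be confirmed byte by byte, since distinct windows can share a fingerprint.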
![Page 15: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/15.jpg)
XDFS
Drawbacks of traditional version control: slow extraction, fragmentation, lack of atomicity support
XDFS:
A user-level file system with versioning support
Separates version labeling from delta compression
Effective delta chain
Built upon Berkeley DB
![Page 16: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/16.jpg)
Log-Structured File System: Sprite LFS
Access assumption: small writes
Data structures: inode, inode map, indirect block, segment summary, segment usage table, superblock (fixed disk location), checkpoint region (fixed disk location), directory change log
![Page 17: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/17.jpg)
Research Data Versioning System
File system: Elephant, Comprehensive Versioning File System
Object store: Self-Secure Storage System, Oceanstore
Database system: Postgres and Fastrek
Storage system: Petal and Frangipani
![Page 18: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/18.jpg)
Elephant File System (1)
Retention policies:
Keep one
Keep all
Keep safe
Keep landmark (intelligently add landmarks)
![Page 19: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/19.jpg)
Elephant File System (2)
Metadata organization
![Page 20: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/20.jpg)
S4: Self-Secure Storage System (1)
Object-store interface
Log everything
Audit log
Efficient metadata logging
![Page 21: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/21.jpg)
S4: Metadata Inefficiency (2)
![Page 22: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/22.jpg)
CVFS: Comprehensive Versioning (1)
Journal based logging vs. Multi-version B-tree
![Page 23: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/23.jpg)
CVFS: Comprehensive Versioning (2)
Journal-based vs. Multi-version B-tree
Assumptions about metadata access
Optimizations:
Cleaner: pointers in version repository
Both forward delta and reverse delta
Checkpointing and clustering
Bounded old-version access by forcing checkpoints
![Page 24: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/24.jpg)
Oceanstore: decentralized storage
A global-scale persistent storage system
A deep archival system
A data entity is identified by <A-GUID, V-GUID>
Internal data structure is similar to S4
Uses a B+ tree for object block indexing
![Page 25: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/25.jpg)
Postgres: a multi-version database (1)
Versioning support
“Save” of a version in the database context
Optimized towards “extract”
Database structure and operation
Tables made up of tuples
Primary and secondary indices
Transaction log: <TID, operation>
Update = Delete + Insert
![Page 26: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/26.jpg)
Postgres: record structure (2)
Extra fields for versioning:
OID: record ID, shared by versions of this record
Xmin: TID of the inserting transaction
Tmin: commit time of Xmin
Xmax: TID of the deleting transaction
Tmax: commit time of Xmax
PTR: forward pointer from old to new
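How these fields support reading an old state can be sketched as follows. This is an illustrative model, not Postgres code: commit times are plain integers, PTR is omitted, and the class and method names are hypothetical. A record version is visible at time t when its inserting transaction committed at or before t and no deleting transaction had committed by t; an update is modeled as a delete of the old version plus an insert of the new one under the same OID.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RecordVersion:
    oid: int                    # record ID shared by all versions of the record
    payload: str
    xmin: int                   # TID of the inserting transaction
    tmin: int                   # commit time of xmin
    xmax: Optional[int] = None  # TID of the deleting transaction, if any
    tmax: Optional[int] = None  # commit time of xmax, if any


class VersionedTable:
    """Append-only table: old versions are stamped, never overwritten."""

    def __init__(self) -> None:
        self.rows: List[RecordVersion] = []

    def insert(self, oid: int, payload: str, tid: int, commit_time: int) -> None:
        self.rows.append(RecordVersion(oid, payload, tid, commit_time))

    def delete(self, oid: int, tid: int, commit_time: int) -> None:
        # Stamp the live version instead of removing it.
        for row in self.rows:
            if row.oid == oid and row.tmax is None:
                row.xmax, row.tmax = tid, commit_time

    def update(self, oid: int, payload: str, tid: int, commit_time: int) -> None:
        # Update = delete of the old version + insert of the new one.
        self.delete(oid, tid, commit_time)
        self.insert(oid, payload, tid, commit_time)

    def as_of(self, t: int) -> Dict[int, str]:
        """Records visible at time t: inserted by t and not yet deleted at t."""
        return {
            row.oid: row.payload
            for row in self.rows
            if row.tmin <= t and (row.tmax is None or t < row.tmax)
        }


table = VersionedTable()
table.insert(oid=1, payload="alice", tid=10, commit_time=100)
table.update(oid=1, payload="alice v2", tid=11, commit_time=200)
table.delete(oid=1, tid=12, commit_time=300)
```

Calling `table.as_of(150)` returns the first version and `table.as_of(250)` the second, which is the kind of lookup a query with a time parameter needs.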
![Page 27: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/27.jpg)
Postgres: Save (3)
![Page 28: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/28.jpg)
Postgres: Represent & Extract (4)
Full image + forward delta
SQL query with TIME parameter
Build indices using R-tree for the operators “contained in” and “overlaps with”
Secondary indices: when a delta record is inserted, if secondary indices need to be changed, a full image needs to be constructed
![Page 29: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/29.jpg)
Postgres: Frequency of extraction (5)
No archive: timestamp never filled in
Light archive: extract time from TIME meta table
Heavy archive: on first use, extract time from TIME metadata, then fill in the field; on later use, read it directly from the data record
![Page 30: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/30.jpg)
Postgres: Hardware Assumption (6)
Another level of archival storage: WORM (optical disks)
Optimizations: indexing, access methods, query plans
Combine indexing on magnetic disks and archival storage
![Page 31: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/31.jpg)
Fastrek: application of versioning
Built on top of Postgres
Tracking read operations
Tracking write operations: Tmin, Tmax
Data dependency analysis
Fast and intelligent repair
![Page 32: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/32.jpg)
Petal and Frangipani
Petal: a distributed storage system that supports virtual disk snapshots
<virtual disk id, off> -> <physical disk id, off>
<virtual disk id, epoch, off> -> <physical disk id, off>
Frangipani: a distributed file system built on top of Petal
Versioning by creating virtual disk snapshots
Coarse granularity: mainly for backup purposes
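The role of the epoch in the address mapping can be illustrated with a small sketch. It is a simplification under stated assumptions, not Petal's actual data structures: there is a single virtual disk and a single physical store, so the mapping key is reduced to (epoch, offset), and the class and method names are hypothetical. Taking a snapshot only advances the epoch; later writes are mapped under the new epoch, so blocks mapped under earlier epochs stay readable.

```python
class VirtualDisk:
    """Toy virtual disk whose block map is keyed by (epoch, offset)."""

    def __init__(self):
        self.epoch = 0
        self.block_map = {}  # (epoch, offset) -> index into self.physical
        self.physical = []   # stand-in for physical disk blocks

    def write(self, offset, data):
        # New data always goes to a fresh physical block, mapped under the current epoch.
        self.physical.append(data)
        self.block_map[(self.epoch, offset)] = len(self.physical) - 1

    def snapshot(self):
        """Freeze the current epoch and return its number as the snapshot id."""
        frozen = self.epoch
        self.epoch += 1
        return frozen

    def read(self, offset, epoch=None):
        """Read the newest block at offset whose epoch is <= the requested epoch."""
        limit = self.epoch if epoch is None else epoch
        epochs = [e for (e, off) in self.block_map if off == offset and e <= limit]
        if not epochs:
            return None
        return self.physical[self.block_map[(max(epochs), offset)]]


disk = VirtualDisk()
disk.write(0, b"old")
snap = disk.snapshot()
disk.write(0, b"new")
disk.write(8, b"later")
```

Here `disk.read(0)` sees the new block while `disk.read(0, snap)` still sees the old one, and offset 8 does not exist in the snapshot at all.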
![Page 33: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/33.jpg)
Commercial Data Versioning Systems
Network Appliance IBM EMC
![Page 34: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/34.jpg)
Network Appliance: WAFL
Network Appliance: customized for NFS and RAID
Automatic checkpointing
Utilizes NVRAM: fast recovery
Good performance: update batching, minimal blocking during versioning
Easy extraction: .snapshot directory
![Page 35: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/35.jpg)
WAFL: system layout
![Page 36: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/36.jpg)
WAFL: Limited Versioning
![Page 37: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/37.jpg)
Network Appliance: SnapMirror
Built upon WAFL
Synchronous mirroring
Semi-synchronous mirroring
Asynchronous mirroring
15-minute interval saves 50% of updates
SnapMirror: gets block information from the blockmap; schedules mirroring at the block-device level
![Page 38: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/38.jpg)
IBM (Flash Copy ESS)
A block-device mirroring system
Copy-old-while-update-new
Uses ESS cache and fast write to mask write latency
Uses a bitmap to keep track of each block of the old version and the new version
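The copy-old-while-update-new idea with a per-block bitmap can be sketched as below. This is a generic copy-on-write illustration, not IBM's implementation: the class name and the in-memory lists are assumptions, and the cache and fast-write path are left out. The bitmap records which blocks have already had their old contents preserved, so the copy happens only on the first write to a block after the snapshot.

```python
class CowSnapshot:
    """Point-in-time image of a volume, preserved lazily on first overwrite."""

    def __init__(self, volume):
        self.volume = volume                  # live blocks, updated in place
        self.saved = [None] * len(volume)     # old contents, filled on demand
        self.copied = [False] * len(volume)   # bitmap: has block i been preserved?

    def write(self, index, data):
        # Copy the old block aside once, then update the live volume.
        if not self.copied[index]:
            self.saved[index] = self.volume[index]
            self.copied[index] = True
        self.volume[index] = data

    def read_snapshot(self, index):
        # Unmodified blocks are still shared with the live volume.
        return self.saved[index] if self.copied[index] else self.volume[index]


volume = [b"a0", b"b0", b"c0"]
snap = CowSnapshot(volume)
snap.write(1, b"b1")
snap.write(1, b"b2")  # second write to the same block copies nothing
```

After these writes the live volume holds `b"b2"` in block 1 while the snapshot still reads `b"b0"`, and only one block has been copied. All writes must go through `write` for the snapshot to stay consistent.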
![Page 39: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/39.jpg)
EMC (TimeFinder)
Split mirror Implementation
![Page 40: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/40.jpg)
Proposal:
Non-point-in-time versioning: what is the most valuable state?
Operation-based journaling: natural metadata journaling efficiency
Design:
Transparent mirroring and versioning
Primary site non-journaling; mirror site journaling against intrusions and mistakes
Applied to a network file server
![Page 41: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/41.jpg)
Repairable File Service: architecture
![Page 42: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/42.jpg)
Represent: operation-based
Delta: NFS packets
Journal: reverse delta chain
No checkpointing overhead
A chain of 2 months will cost <$100
Efficient metadata journaling
100-200 bytes for an inode or directory update
One hash table entry for an indirect block update
![Page 43: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/43.jpg)
Save: a hybrid approach
Data block update: copy-old-create-new
Metadata update:
Naïve: read old, write old, update new
Variation of naïve: guess old, write old, update new
Variation of naïve: get old, write old, update new
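The naïve "read old, write old, update new" sequence for metadata can be sketched as an undo journal. This is an illustration under assumptions rather than the proposed system's code: metadata is modeled as a dictionary from a name to a string, and `UndoJournal`, `update` and `rollback_to` are hypothetical names. Each journal entry stores the value being overwritten, so the journal forms a reverse delta chain that can be replayed backward to reach an earlier state.

```python
class UndoJournal:
    """Journal that records the old value before every metadata update."""

    def __init__(self, store):
        self.store = store  # current metadata: name -> value
        self.log = []       # reverse deltas: (name, old value or None)

    def update(self, name, value):
        old = self.store.get(name)    # read old
        self.log.append((name, old))  # write old to the journal
        self.store[name] = value      # update new
        return len(self.log)          # journal position after this update

    def rollback_to(self, position):
        """Undo updates, newest first, until the journal is back at position."""
        while len(self.log) > position:
            name, old = self.log.pop()
            if old is None:
                self.store.pop(name, None)  # entry did not exist before
            else:
                self.store[name] = old


store = {"inode7": "size=10"}
journal = UndoJournal(store)
mark = journal.update("inode7", "size=20")
journal.update("inode7", "size=0")  # e.g. a mistaken truncate
journal.update("inode9", "size=4")  # e.g. a file created afterwards
journal.rollback_to(mark)
```

The "guess old" and "get old" variations listed above differ in how the old value is obtained; this sketch always reads it from the current store.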
![Page 44: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/44.jpg)
User Level Journaling File System
![Page 45: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/45.jpg)
System Layout
![Page 46: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/46.jpg)
Extract: intelligent and fast repair
Dependency logging
Dependency analysis
Fast repair
Fast extraction of the most valuable state of a data system
Drawback: poor performance for other extraction specifications
![Page 47: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/47.jpg)
Conclusion (1)
Hardware technology -> DV possible: capacity, random access storage, CPU time
Penalty of data loss -> DV a necessity: data loss, system down time
DV technology: journaling, B+ tree, differencing algorithms
![Page 48: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/48.jpg)
Conclusion (2)
DV at application level
DV at file system/database level
DV at storage system/block device level
A combined and flexible solution to satisfy all DV requirements at low cost.
![Page 49: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/49.jpg)
Future Trend (1)
Comprehensive versioning
Perpetual versioning
High-performance versioning: comparable to non-versioning systems
Intrusion-oriented versioning
Testing new untrusted applications
Reduce system maintenance cost
Semantic extraction
![Page 50: Data Versioning Systems](https://reader036.fdocuments.net/reader036/viewer/2022081506/56815223550346895dc06871/html5/thumbnails/50.jpg)
Future Trend (2)
In decentralized storage systems, integrate and separate DV with replication, redundancy, mirroring, encryption
Avoid similar functionality being implemented by multiple modules