Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data...

24
2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved. Data integrity in the Cloud Christoph Hellwig Nebula, Inc

Transcript of Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data...

Page 1: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Data integrity in the Cloud

Christoph HellwigNebula, Inc

Page 2: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Data Integrity

Data integrity in brief:Writing data reliably to persistent

storageRetrieving the same data again later

Page 3: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Application writes

Application

Disk

OS Cache

Applications only write to the OS cache

Reliable ACKs lostNeed fsync to:

Transfer data to diskGet a reliable ACK

Page 4: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Operating System writes

Operating System

Disk

Disk Cache

Often the OS writes to disk cache only

Reliable ACKs lostNeed a cache flush to:

Transfer data to diskGet reliable ACK

Page 5: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Write granularity

Byte-level writes are not atomicOS caches operate on page granularity

(usually 4096 bytes)Disk drives operate on sectors (usually

512 bytes)Striped RAID operates on much larger

granularities (but pretends not to)

Page 6: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Write granularity issues

Too small writes need read-modify-write cyclesMay cause untouched data to be

corrupted Too large writes can be torn

Half-updated data may be on disk

Page 7: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Reading data

As long as the disk and data is there that's easy, right?Silent data corruption happens (a lot)Disk might fail entirely and go away

Page 8: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Data protection

Disks use error correction to deal with bit flipsReads that succeed should return the

right data To protect against disk failure data is

stored in multiple placesN-way mirroring (e.g. RAID1)Erasure encoding (e.g. RAID5/6)

Page 9: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

File System

File System

Multipathing

Volume Manager

Multipathing

File System

Hypervisor

Driver Driver

RAID / SAN

Guest

RAID / SANDisk

Driver

I/O architecture complexity

Personal Computer Enterprise VirtualizationEnterprise Server

Increasing complexity

Page 10: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

I/O architecture complexity

Guest

The Cloud !

GuestGuest Guest

Page 11: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

The Cloud?

Cloud computing in I/O terms:Virtualization as abstractionMassive scale distributed systems (Object Storage)

Page 12: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Multipathing

File System

Hypervisor

Driver

Guest

RAID

Cloud I/O architectures (1)

The “Enterprise” I/O stack Just treats “cloud” as a

management layerScale-out of compute and

disk is handled independently

Page 13: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

File System

Hypervisor

Driver

Guest

Disk

Cloud I/O architectures (2)

“Instance” storageLocal non-fault tolerant

storageUsed for instance-local

temporary dataPart of the AWS and

Openstack models

Page 14: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Distributed storage system

Hypervisor

Guest

Disk

Cloud I/O architectures (3)

Distributed storageAccess by internal

networkingData stored on a

large number of nodes

Hypervisor

Guest

Disk

Driver Driver

Page 15: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Cloud data integrity issues

Mapping between layersComplicated I/O stacksMapping block I/O semantics on file

system semantics Distributed systems

Replication needs to be location-awareFailure is the norm

Page 16: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Without words

Page 17: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

I/O layering issues

Caching semantics need to be preserved over all layers Including mapping them from block

device layers to file systems and backOn distributed systems multiple writers

complicate semantics a lotThis may include live migration

Page 18: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Data placement

Rack

Switch

Node

Node

Node

Node

Node

Node

Rack

Switch

Node

Node

Node

Node

Node

Node

Rack

Switch

Node

Node

Node

Node

Node

Node

Page 19: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Data placement

Rack

Switch

Node

Node

Node

Node

Node

Node

Rack

Switch

Node

Node

Node

Node

Node

Node

Rack

Switch

Node

Node

Node

Node

Node

Node

Good placement?Good placement?

Page 20: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Data placement

Rack

Switch

Node

Node

Node

Node

Node

Node

Rack

Switch

Node

Node

Node

Node

Node

Node

Rack

Switch

Node

Node

Node

Node

Node

Node

Better placement?Better placement?

Page 21: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Data placement

Rack

Switch

Node

Node

Node

Node

Node

Node

Rack

Switch

Node

Node

Node

Node

Node

Node

Rack

Switch

Node

Node

Node

Node

Node

Node

Or maybe this?Or maybe this?

Page 22: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

In-flight data protection

Data might get corrupted in-flight Lots of potentially buggy I/O layers Large clusters → multiple hops through

switches Various protocols offer error checking or

correction E.g. iSCSI crc32c But we'd really like to be able to verify

integrity through the whole I/O stack

Page 23: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

T10 DIF/DIX

End-to-end data protectionStores application checksum in

extended disk sectorsStandard format that can be validated

by all intermediate layers Issues:

Requires disk support, but not part of the ATA standard

Page 24: Data integrity in the Cloud - snia.org · Data Integrity Data integrity in brief: Writing data reliably to persistent storage ... End-to-end data protection

2012 Storage Developer Conference. © Nebula, Inc. All Rights Reserved.

Questions?