Overview of Redundant Disk Arrays
![Page 1: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/1.jpg)
Redundant Arrays of Inexpensive Disks (RAID)
What a cool idea!
Andrew Robinson
University of Michigan
![Page 2: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/2.jpg)
Authors
• David A Patterson
• Garth Gibson
• Randy H Katz
Officially published in 1988.
![Page 3: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/3.jpg)
Overview
• What is RAID?
• Why bother?
• What is RAID, really?
• How well does it work?
• How’s it holding up?
![Page 4: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/4.jpg)
What is RAID?
• Take a bunch of disks and make them appear as one disk.
• Put data on all of them
• Use all at once to gain performance
• Duplicate data to gain reliability
• Buy cheap disks to gain dollars
![Page 5: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/5.jpg)
This seems like a lot of work… why bother?
![Page 6: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/6.jpg)
Let’s go back to 1987
![Page 7: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/7.jpg)
CPUs and Memory kept getting faster…
• Exponential growth everywhere!
• CPU Performance: 1.4X increase per year
– More transistors
– Better architecture
• Memory Performance: 1.4-2X increase per year
– Invention of caches
– SRAM technology
![Page 8: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/8.jpg)
… but disks did not.
• It’s hard to make things spin exponentially faster every year (they tend to fly apart).
• Disk seek time improved at a rate of approximately 7% a year.
• Caching had been employed to buffer I/O activity; this works reasonably well for predictable workloads.
![Page 9: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/9.jpg)
Slow I/O Makes Slow Computers
• Amdahl’s Law describes the impact of improving only some parts of a system while leaving the rest unchanged.
S = 1 / ((1 - F) + F / K)
S – The effective speedup
F – Fraction of work in faster mode
K – Speedup while in faster mode
![Page 10: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/10.jpg)
…really slow.
• If applications spend 10% of their time in I/O, then when computers become 10 times faster, they will appear only about 5 times faster (F = 0.9 and K = 10 give S ≈ 5.3).
Something needed to be done.
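As a rough check on the numbers above, Amdahl’s Law can be evaluated directly. This is a minimal sketch: the function name `amdahl_speedup` is hypothetical, and F = 0.9 assumes that everything except the 10% of time spent in I/O gets faster.

```python
def amdahl_speedup(f: float, k: float) -> float:
    """Effective speedup S when a fraction f of the work runs k times faster."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if k <= 0:
        raise ValueError("k must be positive")
    return 1.0 / ((1.0 - f) + f / k)


# Assumption: 10% of run time is I/O that does not speed up, so f = 0.9.
for k in (10, 100):
    print(f"CPU {k:>3}x faster -> overall {amdahl_speedup(0.9, k):.2f}x faster")
```

Even a 100x faster CPU yields less than a 10x overall speedup, because the untouched I/O time dominates.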
![Page 11: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/11.jpg)
What should we do?
• Single Large Expensive Disks (SLED) are not improving fast enough.
• Larger memory or solid state drives weren’t practical
• Small personal hard drives are emerging… can we do something with those?
![Page 12: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/12.jpg)
Inexpensive Disks Rock
![Page 13: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/13.jpg)
Visual Comparison
![Page 14: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/14.jpg)
Why didn’t someone do this before?
• Standards like SCSI have finally allowed drive makers to integrate features seen in traditional mainframe controllers.
![Page 15: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/15.jpg)
There is a problem…
• A hundredfold increase in the number of disks means a hundredfold decrease in total reliability
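A minimal sketch of why reliability drops with disk count. It assumes disks fail independently at a constant rate, so the mean time to failure (MTTF) of an array with no redundancy is the single-disk MTTF divided by the number of disks. The 30,000-hour disk MTTF is an illustrative value, and `array_mttf_hours` is a hypothetical helper name.

```python
def array_mttf_hours(disk_mttf_hours: float, num_disks: int) -> float:
    """MTTF of an array without redundancy: any one disk failure loses data.

    Assumes independent disk failures at a constant rate.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return disk_mttf_hours / num_disks


disk_mttf = 30_000.0  # illustrative single-disk MTTF in hours (assumption)
for n in (1, 10, 100):
    hours = array_mttf_hours(disk_mttf, n)
    print(f"{n:>3} disks: MTTF = {hours:,.0f} hours ({hours / 24:,.1f} days)")
```

Under these assumptions, 100 disks fail as a group roughly every 300 hours, which is why the check disks introduced next are needed.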
![Page 16: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/16.jpg)
That’s all really nice, but what is RAID, really?
![Page 17: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/17.jpg)
A couple levels… a single idea
• RAID manages the tradeoff between performance and reliability
• RAID comes in levels (RAID1 to RAID5)
• These levels represent points in the performance-reliability space
![Page 18: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/18.jpg)
Groups, Disks, and Check Disks
• RAID organizes disks into reliability groups
• Some of the disks in a group store error-correcting data
D = Total disks with data
G = Data disks in a group
C = Number of check disks in a group
![Page 19: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/19.jpg)
Metrics
• Useable Storage – Percent of storage that holds data, excluding parity information
• Performance – Tough to make one number:
– Reads, Writes, and Read-Modify-Write Access Patterns
– Sequential and Random Data Distribution
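The useable storage metric can be sketched as a one-line calculation. This assumes G counts only the data disks in a group and C the check disks, so the useable fraction is G / (G + C). The helper name `usable_storage` is hypothetical; the G and C values are the ones quoted for each level in these slides.

```python
def usable_storage(data_disks: int, check_disks: int) -> float:
    """Fraction of a group's capacity that holds data rather than check information."""
    if data_disks < 1 or check_disks < 0:
        raise ValueError("need at least one data disk and no negative check disks")
    return data_disks / (data_disks + check_disks)


# (level, G, C) combinations taken from the slides.
configs = [
    ("RAID1", 1, 1),
    ("RAID2", 10, 4),
    ("RAID2", 25, 5),
    ("RAID3", 10, 1),
    ("RAID3", 25, 1),
]
for level, g, c in configs:
    print(f"{level} G={g:>2} C={c}: {usable_storage(g, c):.0%} useable")
```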
![Page 20: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/20.jpg)
RAID1 – The Naive Approach
• Mirroring of all data
• To read:
– Use either disk
• To write:
– Send to both disks simultaneously
• Minor read performance increase.
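The read and write rules above can be illustrated with a toy in-memory model. This is only a sketch: `MirroredPair` is a hypothetical class, the "disks" are Python lists, and alternating reads between the two copies is one possible policy rather than what any particular controller does.

```python
class MirroredPair:
    """Toy RAID1 pair: two in-memory 'disks' that hold identical blocks."""

    def __init__(self, num_blocks: int, block_size: int = 4) -> None:
        self.block_size = block_size
        self.disks = [
            [bytes(block_size) for _ in range(num_blocks)] for _ in range(2)
        ]
        self.failed = [False, False]
        self._next_reader = 0  # which copy serves the next read

    def write(self, block: int, data: bytes) -> None:
        """Send the write to every working copy."""
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        if all(self.failed):
            raise IOError("both copies have failed")
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def read(self, block: int) -> bytes:
        """Use either disk: alternate between copies, skipping a failed one."""
        for _ in range(2):
            i = self._next_reader
            self._next_reader = 1 - i
            if not self.failed[i]:
                return self.disks[i][block]
        raise IOError("both copies have failed")


pair = MirroredPair(num_blocks=8)
pair.write(3, b"RAID")
pair.failed[0] = True   # simulate losing one disk
print(pair.read(3))     # data is still readable from the surviving copy
```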
![Page 21: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/21.jpg)
Evaluation
Pros
• Reads can occur simultaneously
• Seek times can improve with special controllers
• Predictable performance
Cons
• Useable storage is cut in half
• All other performance metrics are left the same
Alright for large sequential jobs and transaction processing jobs
![Page 22: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/22.jpg)
RAID2 – Bit Level Striping
• Uses Hamming Code for Error Detection
• Requires many check disks
– For 10 data disks, 4 check disks
– For 25 data disks, 5 check disks
• Can detect errors, and determine the at-fault disk
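The check disk counts above match the usual bound for a single-error-correcting Hamming code: the smallest c with 2^c >= d + c + 1, where d is the number of data disks. The sketch below assumes that bound is what the slide’s figures reflect; `hamming_check_disks` is a hypothetical helper name.

```python
def hamming_check_disks(data_disks: int) -> int:
    """Smallest c such that 2**c >= data_disks + c + 1.

    This is the standard bound for a single-error-correcting Hamming code:
    c check bits must name any one of the (data + check) positions, plus
    the "no error" case.
    """
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    c = 1
    while 2 ** c < data_disks + c + 1:
        c += 1
    return c


for d in (10, 25):
    print(f"{d} data disks -> {hamming_check_disks(d)} check disks")
```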
![Page 23: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/23.jpg)
RAID2 - Visually
![Page 24: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/24.jpg)
Evaluation
Pros
• Better useable storage, 71% for G=10, 83% for G=25
Cons
• Dismal small random data access performance: 3-9% of RAID1 or SLED
Good for large sequential jobs, bad for transaction processing systems.
![Page 25: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/25.jpg)
RAID3 – Byte Level Striping
• Simpler parity error correction
• Only a single check disk required for error detection
• Cannot determine which disk failed, but that’s usually pretty obvious
• Transfers of large contiguous blocks are good
![Page 26: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/26.jpg)
RAID3
![Page 27: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/27.jpg)
Evaluation
Pros
• Even better useable storage, 91% for G=10, 96% for G=25
Cons
• Small random data access performance: Just as bad as RAID2
Even better for large sequential jobs, bad for transaction processing systems.
![Page 28: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/28.jpg)
What is parity?
• Parity is calculated as an XOR of the data blocks.
• XOR is reversible:
– 1011 (A1) XOR 1100 (A2) => 0111 (AP) “parity”
– 0111 (AP) XOR 1011 (A1) => 1100 (A2)
– 0111 (AP) XOR 1100 (A2) => 1011 (A1)
• This makes error detection and reconstruction possible!
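The reversibility shown above can be sketched in a few lines. The blocks are modeled as small integers using the same bit patterns as the slide; the helper names `parity` and `reconstruct` are hypothetical.

```python
from functools import reduce


def parity(blocks: list[int]) -> int:
    """XOR all blocks together to form the parity block."""
    return reduce(lambda a, b: a ^ b, blocks, 0)


def reconstruct(surviving: list[int], parity_block: int) -> int:
    """Rebuild the single missing block from the survivors and the parity."""
    return parity(surviving + [parity_block])


a1, a2 = 0b1011, 0b1100
ap = parity([a1, a2])
print(f"parity     = {ap:04b}")                      # 0111

# Pretend the disk holding A2 has failed and rebuild it.
print(f"rebuilt A2 = {reconstruct([a1], ap):04b}")   # 1100
```

The same functions work for any number of data blocks, as long as only one block is missing at a time.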
![Page 29: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/29.jpg)
RAID4 - Block Level Striping
• Like RAID3, but with more parallelism
• Interleave data at sector level rather than bit level
• Allows for servicing of multiple block requests by different drives
• Still keeps all the parity information on a single drive
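One way to picture block-level striping with a dedicated parity drive is a simple address mapping. The layout below is an assumption for illustration (round-robin placement, parity on the last drive), and `raid4_locate` is a hypothetical helper name.

```python
def raid4_locate(logical_block: int, data_disks: int) -> tuple[int, int, int]:
    """Map a logical block to (data_disk, stripe, parity_disk).

    Assumed layout: blocks are placed round-robin across the data disks,
    and the parity for every stripe lives on one extra disk at the end.
    """
    if logical_block < 0 or data_disks < 1:
        raise ValueError("invalid block number or disk count")
    stripe, data_disk = divmod(logical_block, data_disks)
    parity_disk = data_disks  # always the same drive
    return data_disk, stripe, parity_disk


# With 4 data disks, blocks 0 and 5 sit on different drives, so two small
# reads can be serviced at the same time.
for block in (0, 5, 10):
    disk, stripe, pdisk = raid4_locate(block, data_disks=4)
    print(f"block {block:>2}: data disk {disk}, stripe {stripe}, parity disk {pdisk}")
```

Every stripe shares the same parity drive in this layout, which is what limits write concurrency.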
![Page 30: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/30.jpg)
RAID4
![Page 31: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/31.jpg)
Evaluation
Pros
• Finally better small random access. Reads are fast!
Cons
• Small writes and read-modify-writes are still slow.
Good for large sequential jobs, still not great for transaction processing systems.
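Small writes are slow because updating one data block also means updating the parity block. A common way to do this without reading the whole stripe is to XOR the old data, the new data, and the old parity, which costs two reads and two writes. The sketch below checks that shortcut against a full recomputation; the helper name `small_write_parity` is hypothetical.

```python
from functools import reduce


def small_write_parity(old_data: int, new_data: int, old_parity: int) -> int:
    """New parity after overwriting one block, without touching the others."""
    return old_parity ^ old_data ^ new_data


stripe = [0b1011, 0b1100, 0b0110]
old_parity = reduce(lambda a, b: a ^ b, stripe, 0)

# Overwrite block 1: read old data and old parity, write new data and new parity.
new_value = 0b0011
new_parity = small_write_parity(stripe[1], new_value, old_parity)
stripe[1] = new_value

# The shortcut agrees with recomputing parity over the whole stripe.
full_recompute = reduce(lambda a, b: a ^ b, stripe, 0)
print(f"{new_parity:04b} {full_recompute:04b}")
```

Because every such update touches the single parity drive, only one small write per group can proceed at a time in RAID4.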
![Page 32: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/32.jpg)
RAID5 – Block Level Striping with Distributed Parity
• Instead of keeping the parity on a single disk, we distribute it across all disks.
• Allows us to support multiple writes per group
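Distributed parity can be sketched as a rotation of the parity position from stripe to stripe. The rotation used here is one possible scheme chosen for illustration, since real implementations use several different layouts; `raid5_parity_disk` and `raid5_locate` are hypothetical helper names.

```python
def raid5_parity_disk(stripe: int, total_disks: int) -> int:
    """Disk holding the parity for a stripe (assumed rotation: last disk first)."""
    return (total_disks - 1 - stripe) % total_disks


def raid5_locate(logical_block: int, total_disks: int) -> tuple[int, int, int]:
    """Map a logical block to (data_disk, stripe, parity_disk)."""
    if logical_block < 0 or total_disks < 2:
        raise ValueError("invalid block number or disk count")
    data_per_stripe = total_disks - 1
    stripe, offset = divmod(logical_block, data_per_stripe)
    parity_disk = raid5_parity_disk(stripe, total_disks)
    # Data fills the remaining disks in ascending order, skipping the parity disk.
    data_disk = offset if offset < parity_disk else offset + 1
    return data_disk, stripe, parity_disk


# Two small writes in different stripes use different parity disks,
# so they need not queue behind a single check disk.
for block in (0, 7, 16):
    disk, stripe, pdisk = raid5_locate(block, total_disks=5)
    print(f"block {block:>2}: data disk {disk}, stripe {stripe}, parity disk {pdisk}")
```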
![Page 33: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/33.jpg)
RAID5
![Page 34: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/34.jpg)
Evaluation
Pros
• Really good usable storage
• Finally decent small random data access performance across the board!
Cons
• Slightly worse write performance: data must be written to two disks simultaneously
Finally, a system that works well for both applications!
![Page 35: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/35.jpg)
Sounds complicated, how well does it work?
![Page 36: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/36.jpg)
As a Whole
• RAID has many different levels that achieve different tradeoffs in reliability and performance
• Almost all of them will, for some (or many) use cases, outperform a SLED for the same cost.
![Page 37: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/37.jpg)
Read-Modify-Write Per Disk Performance
![Page 38: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/38.jpg)
Wow, RAID sounds awesome, how’s it holding up?
![Page 39: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/39.jpg)
Arriving back in 2012 now…
![Page 40: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/40.jpg)
RAID has held up remarkably well
• Data centers around the world use RAID technology.
• The small, inexpensive disk is the de facto standard of storage
• The ideas developed for RAID have been applied to many not-RAID things
![Page 41: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/41.jpg)
Some open questions
• What will become of RAID as new, super-fast storage media start to become cost effective?
• How does it fit in with massive internet-scale storage farms?
![Page 42: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/42.jpg)
Take Aways
• RAID offers significant advantage over SLED for the same cost
– RAID5 offers 10x improvement in performance, reliability, and power consumption while reducing size of array.
• RAID allows for modular growth (add more disks)
• Cost effective option to meet challenge of exponential growth in processor and memory speeds
![Page 43: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/43.jpg)
References
• “A Case for Redundant Arrays of Inexpensive Disks” by David A Patterson, Garth Gibson, and Randy H Katz
• “RAID: A Personal Recollection of How Storage Became a System” by Randy H Katz
• Slides by David Luo and Ramasubramanian K.
• Images generously borrowed from Wikipedia <http://en.wikipedia.org/wiki/RAID>
![Page 44: Overview of Redundant Disk Arrays](https://reader034.fdocuments.net/reader034/viewer/2022042623/54bc270e4a7959336b8b477a/html5/thumbnails/44.jpg)
Thank you!