Enterprise Storage Reinvented

Enterprise Storage Reinvented Stanislav Dzúrik FTSS Storage IBM Slovensko [email protected] The IBM XIV Storage




Transcript of Enterprise Storage Reinvented

  • Enterprise Storage Reinvented

    Stanislav Dzúrik
    FTSS Storage, IBM Slovensko
    [email protected]
    The IBM XIV Storage

  • Traditional Enterprise Storage Solutions

    With this legacy architecture, scalability is achieved by using more powerful (and more expensive) components with higher energy consumption ("scale up").
    Building blocks:
    - Disks/Flash
    - Cache
    - Controllers
    - Interfaces
    - Interconnects

  • Traditional Enterprise Storage Solutions

    Traditionally, storage is improved by further optimizing the existing concept, e.g.:
    - Using faster and more reliable drives
    - Adding more cache
    - Manufacturing new backplanes
    - Adding new hardware/software layers for virtualization and thin provisioning
    - Forklift upgrades
    This comes at a price: high cost, complex solutions and increased power consumption.
    We have to look for different ways to meet our ever-growing need for larger, faster, more flexible, more efficient and more reliable ways to store our data.
    We had to reinvent the way we look at storage.

  • IBM XIV Storage Architecture: A Disruptive Grid

    Design principles ("scale out"):
    - Massive parallelism
    - Granular distribution
    - Off-the-shelf components
    - Coupled disk, RAM and CPU
    - User simplicity

  • IBM XIV Enterprise Storage Solution

    IBM XIV Storage is based on the following basic principles:
    - The entire system is one virtual space
    - Simple storage provisioning and thin allocation
    - Self-healing: the failure of a component is automatically fixed, with no impact on the reliability and performance of the system
    - Self-tuning: the provisioning and management of data should always result in the optimal use of available space
    - The speed of data access is not dependent on the speed of the drives (no disk hot spots)
    - Makes use of readily available standard components
    - Green: efficient use of resources (power, cooling, space)
    - Best-in-class TCO: cost effective

  • IBM XIV Storage Distribution Algorithm

    - Each volume is spread across all drives
    - Data is cut into 1MB partitions and stored on the disks
    - The XIV algorithm automatically distributes partitions across all disks in the system pseudo-randomly
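    The idea of pseudo-random partition placement can be sketched with a deterministic hash. This is an illustrative model only, not IBM's proprietary distribution tables: the function name and the mirror-placement rule are assumptions, chosen to show how a hash spreads 1MB partitions evenly across all disks while keeping each partition's two copies on different disks.

    ```python
    import hashlib

    def place_partition(volume_id: str, partition_index: int, num_disks: int) -> tuple[int, int]:
        """Map one 1MB partition to a (primary, mirror) pair of disks.

        Hypothetical sketch: a SHA-256 hash of (volume, partition) gives a
        pseudo-random but deterministic placement, so every volume ends up
        spread across every disk with no hot spots.
        """
        key = f"{volume_id}:{partition_index}".encode()
        digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
        primary = digest % num_disks
        # The mirror copy must land on a different disk for redundancy.
        offset = 1 + (digest >> 16) % (num_disks - 1)
        secondary = (primary + offset) % num_disks
        return primary, secondary
    ```

    Because the placement is a pure function of volume and partition index, any module can compute where a partition lives without consulting a central directory.
    
    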

  • XIV Space Distribution on System Changes

    Data distribution only changes when the system changes:
    - Equilibrium is kept when new hardware is added
    - Equilibrium is kept when old hardware is removed
    - Equilibrium is kept after a hardware failure
    [Figure: data modules rebalancing after a hardware upgrade]

  • XIV Distribution Algorithm on System Changes

    Data distribution only changes when the system changes:
    - Equilibrium is kept when new hardware is added
    - Equilibrium is kept when old hardware is removed
    - Equilibrium is kept after a hardware failure
    [Figure: data modules rebalancing after a hardware failure]
    Because distribution is full and automatic, all spindles join the effort of data redistribution after a configuration change.

    Quick recovery from failure. No disk hot spots: all drives are used equally.
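    The "equilibrium is kept" property can be demonstrated with rendezvous (highest-random-weight) hashing, a stand-in for XIV's proprietary algorithm that shares the behaviour the slides describe: when a disk is added, only the minimal fraction of partitions moves, and every moved partition goes to the new disk. All names here are illustrative.

    ```python
    import hashlib

    def owner(partition: int, disks: list[str]) -> str:
        """Pick the disk with the highest hash weight for this partition."""
        def weight(disk: str) -> int:
            return int.from_bytes(
                hashlib.sha256(f"{disk}:{partition}".encode()).digest()[:8], "big")
        return max(disks, key=weight)

    # Simulate adding a fourth disk to a three-disk system.
    before = [owner(p, ["d1", "d2", "d3"]) for p in range(3000)]
    after = [owner(p, ["d1", "d2", "d3", "d4"]) for p in range(3000)]
    moved = sum(b != a for b, a in zip(before, after))
    # Roughly 1/4 of the partitions migrate to the new disk; the rest stay put,
    # so the layout returns to an even spread with minimal data movement.
    ```

    The same argument applies in reverse when a disk is removed or fails: only the partitions it held need new homes, and they are redistributed across all remaining disks in parallel.
    
    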

  • Storage in a Grid: How Does It Work?

    - Data and spare space are spread redundantly over all the drives, with parallel access and smart caching, to match the performance of high-end systems
    - If a drive fails, the system replicates the lost data across the other drives; the system is fully redundant again in less than 30 minutes, with minimal performance impact
    - Efficient and green by design
    - Simplified architecture: each volume is spread across all the drives
    - Use of large SATA disks
    - Integrated software
    - Thin, smart and simple to manage
    - Simple migrations

  • Making Storage Efficient

    Comparing raw capacity to actual application data capacity in a traditional storage system shows an amazing gap. Several factors contribute:
    - Over-provisioning of space
    - Backups, clones, BCVs and snaps
    - Orphaned space
    The XIV Enterprise architecture makes efficient use of the capacity, so you can meet your needs with far less raw capacity. It enables top reliability and high, consistent performance with energy-efficient, high-density components (e.g. SATA).

  • Over-Provisioning

    - Traditional systems require upfront allocation of predicted space
    - Volume resizing in the storage system is complex
    - Volume resizing is virtually impossible for applications: it means inevitable application downtime
    The result:
    - Users tend to pad requirements, leading to over-provisioning
    - Excess capacity goes unused for months, years or forever
    - Excess floor space, power and cooling are wasted
    - You're stuck with old systems and can't benefit from the expected decline in price and advances in technology

  • XIV Thin Provisioning Concept

    Virtualize volume size:
    - Define virtual volumes of maximal capacity for each application
    - Map them to much less physical storage
    - Scale physical storage over time without having to change the virtual volume size
    The result:
    - Application volumes never need to be resized, so there is no application downtime
    - Buy storage over time, just when you actually need it
    - Don't pay for over-estimation
    - Don't waste floor space, power and cooling on unused space
    - Keep users happy

  • How Does It Work?

    The system is bound by two parameters:
    - Hard capacity: the net physical capacity of the system, determined by the system size
    - Soft capacity: the virtual capacity of the entire system, defined by the administrator
    Virtual volumes are defined out of the system's soft capacity, and each virtual volume is mapped to physical capacity taken out of the system's hard capacity.
    Pool-based thin provisioning:
    - Thin provisioning is set independently for each storage pool
    - Protection for critical applications
    - Hard and soft capacity can easily be moved from one pool to another
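    The hard/soft bookkeeping can be captured in a toy accounting model. This is a minimal sketch with hypothetical names, not the XIV CLI or API: soft capacity bounds the sum of virtual volume sizes, hard capacity bounds the data physically written.

    ```python
    class StoragePool:
        """Toy model of pool-based thin provisioning (illustrative only)."""

        def __init__(self, hard_tb: float, soft_tb: float):
            self.hard_tb = hard_tb          # net physical capacity of the pool
            self.soft_tb = soft_tb          # virtual capacity of the pool
            self.provisioned_tb = 0.0       # sum of virtual volume sizes
            self.written_tb = 0.0           # data actually written

        def create_volume(self, size_tb: float) -> None:
            # Volume creation only consumes soft (virtual) capacity.
            if self.provisioned_tb + size_tb > self.soft_tb:
                raise ValueError("soft capacity exhausted")
            self.provisioned_tb += size_tb

        def write(self, tb: float) -> None:
            # Writes consume hard (physical) capacity; this is where the
            # "monitor and add capacity in time" warning applies.
            if self.written_tb + tb > self.hard_tb:
                raise ValueError("hard capacity exhausted: add hardware "
                                 "or move hard capacity from another pool")
            self.written_tb += tb
    ```

    With this model, a pool holding 10TB of physical disk can happily host a 40TB virtual volume, and physical capacity only needs to be bought when written data approaches the hard limit.
    
    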

  • Some Notes About Thin Provisioning

    Monitoring is crucial:
    - Monitor actual capacity usage and how fast it is growing
    - Make sure you add capacity in time, or your applications will not be able to write data
    Actual savings depend on:
    - The application profile
    - The administration standards of the company

  • Differential Backups and Snaps

    Backups are periodic copies of entire volumes, kept to:
    - Maintain regulatory compliance
    - Restore data after corruption or human error
    Traditional systems use a full volume copy for each backup, so each copy requires allocation of the entire volume capacity.
    The solution: differential backups through XIV snaps
    - Save only the deltas from the main volume
    - No upfront allocation: space is allocated over time, as new data is written
    Actual savings depend on:
    - The number of snapshots
    - The application profile
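    The delta-only idea can be sketched as a copy-on-first-write scheme. This is a minimal illustrative model, not XIV's actual snapshot implementation: a snapshot starts empty and only stores the partitions overwritten after it was taken, so its footprint is exactly the delta.

    ```python
    class SnapVolume:
        """Toy volume with differential snapshots (illustrative only)."""

        def __init__(self):
            self.partitions = {}      # partition index -> current data
            self.snapshots = []       # each snapshot: {index: old data}

        def snap(self) -> int:
            """Take a snapshot; costs no space until data changes."""
            self.snapshots.append({})
            return len(self.snapshots) - 1

        def write(self, index: int, data: bytes) -> None:
            # Preserve old contents in every snapshot that hasn't yet
            # captured this partition (copy on first write).
            old = self.partitions.get(index)
            for snapshot in self.snapshots:
                snapshot.setdefault(index, old)
            self.partitions[index] = data

        def read_snapshot(self, snap_id: int, index: int):
            snapshot = self.snapshots[snap_id]
            return snapshot[index] if index in snapshot else self.partitions.get(index)
    ```

    Unchanged partitions are read straight from the live volume, which is why the savings depend on the number of snapshots and on how much the application overwrites between them.
    
    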

  • No Orphaned Space

    In traditional architectures some capacity is effectively lost over time, due to:
    - The complexity of volume and performance management
    - The ever-changing applications and their storage needs
    The result: idle storage chunks are scattered across the system, and reclaiming them is more expensive than buying a new system.
    The solution: let the system automate volume allocation and management
    - Maintain equilibrium across the system throughout its lifetime
    - Automatically handle tasks such as striping, volume resizing, migration, etc.
    The result: no space is ever lost.

  • Stretching a TB to the Max

  • How Capacity Efficiency Translates to Savings

    Serve the same applications with fewer TBs

  • SATA Disks to Save Even More Power

    The power consumption of a system is the sum of the power used by its components. Since there are so many of them, disks are typically the biggest consumers.
    SATA vs. FC disks:
    - SATA drives provide 2-10 times the capacity
    - A lower spin rate means each disk requires 25-30% less power
    The result: far less power is used per raw TB (3 to 15 times less).
    Using SATA drives is not a compromise:
    - The XIV architecture offers primary-storage performance for all volumes
    - It adapts to any future changes in volumes and capacity

  • System Power Usage

    Power consumption of a system comparable to XIV is 180-380W per raw TB, typically using 146GB 15K rpm disks.
    Power consumption of an XIV rack is 7.7kW for 180TB raw capacity (79TB net): 42W per raw TB today.
    Rack power consumption will not change much with 2TB disks, but capacity will double, so consumption per raw TB is expected to drop to 21W.
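    The per-TB figures above follow directly from the rack numbers on the slide; the short calculation below reproduces them (the slide rounds 42.8 down to 42W).

    ```python
    # Numbers taken from the slide: 7.7kW rack, 180TB raw capacity,
    # capacity doubling with 2TB disks at roughly constant rack power.
    rack_power_w = 7_700
    raw_tb_today = 180
    raw_tb_2tb_disks = 360

    w_per_tb_now = rack_power_w / raw_tb_today        # ~42.8 W/TB (slide: 42W)
    w_per_tb_2tb = rack_power_w / raw_tb_2tb_disks    # ~21.4 W/TB (slide: 21W)
    ```

    Compare this with the 180-380W per raw TB quoted for comparable systems: the XIV figure is roughly 4-9 times lower even before the 2TB-disk refresh.
    
    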

  • System Power Efficiency

  • Power per TB Is Only Half the Story

    Using less power to drive a storage system means:
    - Direct power cost savings
    - Cooling savings, typically adding 30%-100% of the system's power consumption depending on data centre efficiency
    - Smaller UPS infrastructure
    Power is becoming a limited resource. It's not just a matter of paying more: power companies are unable to deliver enough power to satisfy the expected growth.

  • Ease of Management

  • Ease of Management

    System administration is virtually effortless:
    - Complex tasks are handled automatically under the hood
    - Always optimized
    - Not prone to human error
    Things an administrator no longer needs to worry about:
    - Planning volume layout
    - Optimizing for performance
    - Tiering data to service levels
    Ease of provisioning means better service to users, and instant creation of snapshots makes backup procedures simpler.

  • Migration

    - Automatic data migration
    - Migrating thick volumes to thin-provisioned volumes
    - Online data migration from other storage arrays
    - New hardware can be added to the system: better performance, less power, more density
    - Outdated hardware can be phased out and removed

  • Summary

    Enterprise Storage Reinvented

  • Architectural design:

    - Each data module is a storage system on its own, with CPU, cache and disk; multiple such storage systems are coupled together.
    - Redundancy is achieved by having data modules protect each other.
    - Granular distribution of data across the data modules.
    - Switching bandwidth: the system includes 2 GigE switches. Each data module has 2 active connections (for performance and redundancy), one per switch.
    - 1Gb bidirectional between each switch and data module, effectively reaching 4Gb of bandwidth for each module.

    IMPACT:
    - System performance is excellent due to parallelism and caching (even though individual disks are slower)
    - Distributed, module-independent cache eliminates bottlenecks and hot spots
    - Drives are connected to the on-board CPU, for extreme bandwidth between cache and disks
    - Extreme switching bandwidth allows very aggressive pre-fetching; when connecting multiple racks, a 10GigE switch is used
    - System scalability is unlimited: as you add disks, you also add cache and CPU

    System reliability is above any other system (although individual disks are not more reliable). Key reasons for reliability:
    - Consistent load on drives eliminates hot spots
    - Rebuild involves many drives
    - Failures can be predicted better, since the load is similar across drives
    - Extremely short rebuild time: 750GB in 20 minutes

    Because of the distribution of data and its granularity, the result is:
    1. Better performance: more drives serve the application, yielding much better performance (both reads and writes) and the elimination of hot spots
    2. Better system optimization: because of the granularity, you don't have to tune the system constantly

    No other system goes to this level of granularity:
    - 3Par: 256MB
    - EMC Symmetrix: a few GB

    Why is 1MB the ideal partition size?
    - Small enough to improve performance significantly and create random distribution; a partition will be in cache if used frequently
    - Large enough that you can get an optimal amount of data from the drive in one I/O

    Customer example: one of the leading banks in Israel installed XIV's Nextra, and their batch window was cut to 1/3 of the original time, from 6 hours to 2 hours.
    Because of the innovative distribution algorithm, performance is consistent:
    - During standard functionality
    - During advanced functionality
    - During redistribution (provisioning, decommissioning, hardware failure)

    When hardware is added, existing drives are rebalanced too, due to the redistribution of data. Even load distribution achieves high drive MTBF and superior failure prediction.

    XIV is the only system that is fully automated; other systems require manual intervention or special applications for data distribution. Other systems can rebuild, but with performance penalties, lengthy timelines and I/O exposure to a second failure during the lengthy rebuild. XIV's rebuild time is revolutionary (500GB in 15 minutes): only actually used capacity is replicated, so if the system is 50% full, the rebuild of a drive takes half the time.
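    The rebuild claim implies a simple linear model: rebuild time scales with the amount of written data on the failed drive, not with the drive's size. The helper below is a sketch of that relationship, calibrated to the slide's 500GB-in-15-minutes figure; the function name and parameters are illustrative.

    ```python
    def rebuild_minutes(drive_gb: float, fill_fraction: float,
                        full_drive_minutes: float = 15.0,
                        reference_gb: float = 500.0) -> float:
        """Estimate rebuild time when only used partitions are re-replicated.

        Calibrated to the slide's figure: a fully used 500GB drive
        rebuilds in 15 minutes; a half-full system rebuilds in half
        the time.
        """
        return full_drive_minutes * (drive_gb / reference_gb) * fill_fraction

    print(rebuild_minutes(500, 1.0))   # 15.0 minutes, fully used drive
    print(rebuild_minutes(500, 0.5))   # 7.5 minutes at 50% full
    ```

    The short rebuild window matters because it shrinks the period during which a second drive failure could cause data loss.
    
    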