Storage Virtualization Introduction

Storage Virtualization Seminar Stephen Foskett Director of Data Practice, Contoural

Transcript of Storage Virtualization Introduction

Page 1: Storage Virtualization Introduction

Storage Virtualization Seminar

Stephen Foskett, Director of Data Practice, Contoural

Page 2: Storage Virtualization Introduction

Part 1: Breaking the Connections

Storage virtualization is here, breaking the connection between physical storage infrastructure and the logical way we use it

Page 3: Storage Virtualization Introduction

Agenda

What is storage virtualization?

Volume management

Advanced file systems

Virtualizing the SAN

Virtual NAS

Page 4: Storage Virtualization Introduction

Poll: Who is Already Using Storage Virtualization?

We talk about virtualization like it is new or strange…

…but your storage is already virtualized!
• Disk drives map blocks
• RAID is as old as storage (conceived 1978-1988)
• Modern OSes include volume management and path management
• Network-attached storage (NAS) redirectors and DFS
• Storage arrays are highly virtualized (clustering, LUN carving, relocation, tiering, etc.)

According to ESG, 52% have already implemented storage virtualization and 48% plan to! (ESG 2008)

Page 5: Storage Virtualization Introduction

SNIA Defines Storage Virtualization

1. The act of abstracting, hiding, or isolating the internal function of a storage (sub)system or service from applications, compute servers, or general network resources for the purpose of enabling application- and network-independent management of storage or data.

2. The application of virtualization to storage services or devices for the purpose of aggregating, hiding complexity, or adding new capabilities to lower-level storage resources. Storage can be virtualized simultaneously in multiple layers of a system, for instance to create HSM-like systems.

Page 6: Storage Virtualization Introduction

What and Why?

Virtualization removes the hard connection between storage hardware and users
• Address space is mapped to logical rather than physical locations
• The virtualizing service consistently maintains this metadata
• I/O can be redirected to a new physical location

We gain by virtualizing
• Efficiency, flexibility, and scalability
• Stability, availability, and recoverability
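The mapping idea above can be sketched in a few lines of Python. This is a toy model, not any vendor's implementation: the class names and structure are invented for illustration. The point is that the virtualization layer owns the logical-to-physical metadata, so data can move without applications noticing.

```python
# Toy virtualization layer: a map from logical block addresses to physical
# locations lets I/O be redirected to a new physical home transparently.

class VirtualVolume:
    def __init__(self):
        self.map = {}      # logical block -> (device, physical block)
        self.devices = {}  # device name -> {physical block: data}

    def write(self, lba, data, device, pba):
        self.devices.setdefault(device, {})[pba] = data
        self.map[lba] = (device, pba)

    def read(self, lba):
        device, pba = self.map[lba]
        return self.devices[device][pba]

    def migrate(self, lba, new_device, new_pba):
        """Move a block to a new physical location; readers never notice."""
        data = self.read(lba)
        self.devices.setdefault(new_device, {})[new_pba] = data
        self.map[lba] = (new_device, new_pba)  # only metadata changes

vol = VirtualVolume()
vol.write(0, b"hello", "array-A", 42)
vol.migrate(0, "array-B", 7)
assert vol.read(0) == b"hello"   # same logical address, new physical home
```

The application keeps using logical address 0 throughout; only the virtualization layer's metadata changed.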

Page 7: Storage Virtualization Introduction

The Non-Revolution: Storage Virtualization

We’ve been talking about storage virtualization for 15 years!

Virtualization exists for both block and file storage networks

Can be located in server-based software, on network-based appliances, SAN switches, or integrated with the storage array

[Diagram: virtualization locations – software, switch, appliance, array]

Page 8: Storage Virtualization Introduction

Introducing Volume Management

Volume management = server-based storage virtualization

Volume managers abstract block storage (LUNs, disks, partitions) into virtual “volumes”

Very common – all* modern OSes have volume managers built in
• Windows Logical Disk Manager, Linux LVM/EVMS, AIX LVM, HP-UX LVM, Solaris Solstice, Veritas Volume Manager

Mostly used for flexibility
• Resize volumes
• Protect data (RAID)
• Add capacity (concatenate or expand stripe or RAID)
• Mirror, snapshot, replicate
• Migrate data
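A minimal sketch of one of these flexibility tricks, concatenation, shows how a volume manager presents several disks as one larger address space and grows it online. This is an illustrative toy, not any OS's LVM code; the names are invented.

```python
# Toy concatenating volume manager: several disks become one linear
# address space, and capacity can be added without disturbing data.

class Disk:
    def __init__(self, name, blocks):
        self.name = name
        self.data = [None] * blocks

class ConcatVolume:
    def __init__(self, disks):
        self.disks = list(disks)

    def size(self):
        return sum(len(d.data) for d in self.disks)

    def _locate(self, block):
        # Walk the concatenated extents to find the owning disk.
        for disk in self.disks:
            if block < len(disk.data):
                return disk, block
            block -= len(disk.data)
        raise IndexError("block beyond end of volume")

    def write(self, block, value):
        disk, offset = self._locate(block)
        disk.data[offset] = value

    def read(self, block):
        disk, offset = self._locate(block)
        return disk.data[offset]

    def extend(self, disk):
        # Online resize: new capacity appears at the end of the volume.
        self.disks.append(disk)

vol = ConcatVolume([Disk("lun0", 100), Disk("lun1", 50)])
vol.write(120, "x")              # lands on lun1 at offset 20
vol.extend(Disk("lun2", 100))    # volume grows from 150 to 250 blocks
assert vol.size() == 250
assert vol.read(120) == "x"
```

Real volume managers add striping, mirroring, and RAID on top of the same basic extent-mapping idea.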

Page 9: Storage Virtualization Introduction

Logical Volume Managers

Platform | Volume Manager | Notes
AIX | Logical Volume Manager | OSF LVM, no RAID 5, no copy-on-write snapshots
HP-UX 9.0+ | HP Logical Volume Manager | OSF LVM, no RAID 5
FreeBSD | Vinum Volume Manager | No copy-on-write snapshots
Linux 2.2+ | Logical Volume Manager and Enterprise Volume Management System | Based on OSF LVM, no RAID 5
Solaris | Solaris Volume Manager (was Solstice DiskSuite) | Limited allocation options, no copy-on-write snapshots
AIX, HP-UX, Linux, Solaris, Windows | Symantec Veritas Volume Manager (VxVM), Storage Foundation | Full-featured multi-platform volume manager
Windows 2000+ | Logical Disk Manager | Co-developed with Veritas, limited allocation options, copy-on-write snapshots introduced in Server 2003
Solaris, BSD, Mac OS X 10.6+ | ZFS | Combined filesystem and volume manager

Page 10: Storage Virtualization Introduction

ZFS: Super File System!

ZFS (originally “zettabyte file system”) is a combined file system, volume manager, and disk/partition manager
• Open source (CDDL) project managed by Sun
• Will probably replace UFS (Sun), HFS+ (Apple OS X Snow Leopard Server)

ZFS creates a truly flexible, extensible, and full-featured pool of storage across systems and disks
• Filesystems contained in “zpools” on “vdevs” with striping and optional RAID-Z/Z2
• 128-bit addresses mean near-infinite capacity (in theory)
• Blocks are “copy-on-write” with checksums for snapshots, clones, authentication

…but there are some limitations
• Adding (and especially removing) vdevs is hard/impossible
• Stacked RAID is impossible
• There is no clustering (until Sun adds Lustre)
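The copy-on-write-with-checksums idea is worth unpacking, since it is what makes ZFS snapshots nearly free. The sketch below is a toy model in the spirit of ZFS, not its on-disk format: a snapshot is just a saved copy of the block map, and because writes never overwrite blocks in place, the snapshot's view stays intact.

```python
import hashlib

# Toy copy-on-write store: every write allocates a fresh block with a
# checksum; a snapshot is a cheap copy of the offset->block map.

class CowStore:
    def __init__(self):
        self.blocks = {}   # block id -> (data, checksum)
        self.live = {}     # file offset -> block id
        self.next_id = 0

    def _alloc(self, data):
        bid = self.next_id
        self.next_id += 1
        self.blocks[bid] = (data, hashlib.sha256(data).hexdigest())
        return bid

    def write(self, offset, data):
        self.live[offset] = self._alloc(data)   # never overwrite in place

    def read(self, offset, view=None):
        bid = (view or self.live)[offset]
        data, checksum = self.blocks[bid]
        # Verify the checksum on every read to detect silent corruption.
        assert hashlib.sha256(data).hexdigest() == checksum
        return data

    def snapshot(self):
        return dict(self.live)   # copies the map, not the data

s = CowStore()
s.write(0, b"v1")
snap = s.snapshot()
s.write(0, b"v2")                 # COW: v1's block is untouched
assert s.read(0) == b"v2"
assert s.read(0, view=snap) == b"v1"
```

Clones work the same way: a clone starts from a snapshot's map and diverges block by block as it is written.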

Page 11: Storage Virtualization Introduction

Path Management Software

Path management virtualizes the connection from a server to a storage system
• Failover
• Load balancing strategies

A few choices
• Veritas DMP (cross-platform, with Storage Foundation)
• EMC PowerPath (supports EMC, HDS, IBM, HP)
• IBM SDD (free for IBM)
• HDS HDLM
• Microsoft MPIO (Windows, supports iSCSI and most FC)
• VMware Failover Paths
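The failover behavior all of these products share can be sketched conceptually. This toy model is not any real multipath driver's API: the manager presents one virtual path and retries I/O on a surviving path when the active one fails.

```python
# Conceptual path-failover sketch: try paths in priority order and fall
# through to the next healthy path on error.

class Path:
    def __init__(self, name, healthy=True):
        self.name, self.healthy = name, healthy

    def send(self, io):
        if not self.healthy:
            raise IOError(f"path {self.name} failed")
        return f"{io} via {self.name}"

class PathManager:
    def __init__(self, paths):
        self.paths = paths

    def submit(self, io):
        for path in self.paths:            # failover: walk the path list
            try:
                return path.send(io)
            except IOError:
                continue                   # path down, try the next one
        raise IOError("all paths down")

pm = PathManager([Path("hba0"), Path("hba1")])
assert pm.submit("READ lba 0") == "READ lba 0 via hba0"
pm.paths[0].healthy = False                # cable pull on the active path
assert pm.submit("READ lba 0") == "READ lba 0 via hba1"
```

Load balancing is the same machinery with a different path-selection policy (round-robin, least-queue-depth, etc.) instead of strict priority order.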

Page 12: Storage Virtualization Introduction

Virtualizing the SAN

The storage area network (SAN) is a popular location for virtualization
• Can require less reconfiguration and server work
• Works with all servers and storage (potentially)

Resides on an appliance or switch placed in the storage network
• Some are in the data path, others are less so
• Brocade and Cisco switches have application blades
• Some use dedicated storage services modules (SSMs)

Page 13: Storage Virtualization Introduction

In-Band vs. Out-of-Band

In-band devices intercept traffic: the host asks “Where’s my data?” and the device answers with the data itself (“I got yer data right here!”)

Out-of-band devices redirect traffic: the host asks “Where’s my data?” and is pointed to it (“It’s over there!”)

Page 14: Storage Virtualization Introduction

SAN Virtualization Products

Product | Architecture | Location | Thin Prov. | Repl. | Notes
DataCore SANsymphony | In-band IP | Generic x86 | Yes | Yes | Supports SCSI, FC, ATA drives over IP on a Wintel server
EMC Invista | Out-of-band FC | FC switch | No | No | No caching; supports Cisco & Brocade FC switch blades and SSPs
FalconStor IPStor NSS | In- or out-of-band IP | Generic x86 | No | Yes | Block-based; supports a variety of drive types on a Wintel server or Cisco FC blade
IBM SVC | In-band FC | Appliance | No | Yes | Supports most FC storage; large caches; IBM hardware
Incipient iNSP | Out-of-band FC | FC switch | No | No | No caching; supports Cisco FC blades
LSI StoreAge SVM | Out-of-band FC | Appliance & host SW | No | Yes | No caching; split-path FC with proprietary SSM
Reldata Unified Storage | In-band | Appliance | No | Yes | NAS and IP SAN
Sanrad V-Switch | In-band IP | Appliance | No | Yes | Bridges FC to iSCSI

Page 15: Storage Virtualization Introduction

Virtual NAS

File-based network-attached storage (NAS) lends itself to virtualization
• IP network connectivity and host processing possibilities

Multitude of file servers? Virtualize!
• Global namespace across all NAS and servers
• Share excess capacity
• Transparently migrate data (easier than redirecting users!)
• Tier files on large “shares” with a variety of data
• Create multiple virtual file servers
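The global-namespace idea above can be sketched as a simple indirection table. This is a toy model in the spirit of DFS-style namespaces, not any product's implementation: clients see one virtual tree, and the namespace maps each directory to whichever file server actually holds it.

```python
# Toy global namespace: one virtual path tree maps to physical
# (server, export) locations, so data can move without clients noticing.

class GlobalNamespace:
    def __init__(self):
        self.links = {}   # virtual path -> (server, export)

    def mount(self, vpath, server, export):
        self.links[vpath] = (server, export)

    def resolve(self, vpath):
        return self.links[vpath]

    def migrate(self, vpath, new_server, new_export):
        # Transparent migration: clients keep the same virtual path.
        self.links[vpath] = (new_server, new_export)

ns = GlobalNamespace()
ns.mount("/corp/eng", "filer1", "/vol/eng")
ns.migrate("/corp/eng", "filer2", "/exports/eng")
assert ns.resolve("/corp/eng") == ("filer2", "/exports/eng")
```

This is why migration is "easier than redirecting users": only the namespace entry changes, never the path users have memorized.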

Page 16: Storage Virtualization Introduction

NAS Virtualization Products

Product | Architecture | Location | Notes
Attune Maestro | In-band | Appliance | Windows-focused with replication and snapshots
BlueArc | In-band | Clustered NAS | Clustered integrated NAS with global namespace
Brocade FME | In-band | Appliance | DFS/CIFS initially with NFS in the works
Brocade StorageX | Out-of-band | Host SW | DFS and NIS; also does data migration; also NetApp VFM and HDS
Data Domain | In-band | Appliance or host SW | Deduplication NAS/VTL/OST target for block storage
EMC Rainfinity | In-band | Appliance or host SW | DFS management
F5 Acopia | In-band | Appliance | Split-path architecture, non-DFS
Microsoft DFS | Out-of-band | Host SW | Windows/SMB only; Server 2003 R2+ enhanced management
Network Appliance vFiler | In-band | Clustered NAS | Clustered NAS “head” with global namespace
ONStor GNS | In-band | Clustered NAS & DFS | Combines clustered NAS with DFS into a single global namespace
Reldata Unified Storage | In-band | Clustered NAS | NAS and IP SAN

Page 17: Storage Virtualization Introduction

Transformed Storage Systems

Virtualization technology is common in storage array controllers
• Arrays create large RAID sets and “carve out” virtual LUNs for use by servers
• Controller clusters (and grids) redirect activity based on workload and availability
• Snapshots/mirrors and replication are common features

A new generation of arrays with virtualization features is appearing, with tiered storage, thin provisioning, migration, and de-duplication

Sub-disk RAID = the end of RAID as we know it?

Page 18: Storage Virtualization Introduction

Virtual Tiered Storage

Array controllers can transparently move data from low-cost to high-performance disk

Most arrays support multiple drive types
• “Bulk” SATA or SAS drives are common (500 GB - 1 TB)
• Solid-state drives are the latest innovation

Some arrays can dynamically load balance

A few can “hide” other arrays “behind”
• SAN: HDS USP-V and similar from Sun, HP
• NAS: Network Appliance vFiler, ONStor Bobcat

Page 19: Storage Virtualization Introduction

Thin Provisioning

Storage is commonly over-allocated to servers

Some arrays can “thinly” provision just the capacity that actually contains data
• 500 GB request for a new project, but only 2 GB of initial data is written – the array only allocates 2 GB and expands as data is written
• Symantec API, thin-unprovisioning capabilities

What’s not to love?
• Oops – we provisioned a petabyte and ran out of storage
• Chunk sizes and formatting conflicts
• Can it thin unprovision?
• Can it replicate to and from thin-provisioned volumes?

Thin provisioning is an abdication of our responsibilities!
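The 500 GB / 2 GB example above can be made concrete with a toy model (chunk size and names are invented for illustration, not taken from any array): capacity is promised up front but only backed by physical chunks when data is first written.

```python
# Toy thin-provisioned volume: the host sees the full provisioned size,
# but the array only allocates chunks that have actually been written.

CHUNK_GB = 1  # allocation granularity, simplified to 1 GB

class ThinVolume:
    def __init__(self, provisioned_gb):
        self.provisioned = provisioned_gb
        self.chunks = set()   # chunk indexes actually backed by storage

    def write(self, offset_gb):
        if offset_gb >= self.provisioned:
            raise IOError("write beyond provisioned size")
        self.chunks.add(offset_gb // CHUNK_GB)   # allocate on first write

    def allocated_gb(self):
        return len(self.chunks) * CHUNK_GB

vol = ThinVolume(500)        # the server is promised 500 GB
vol.write(0)
vol.write(1)                 # only 2 GB of data written so far
assert vol.allocated_gb() == 2   # the array has only spent 2 GB
```

The "oops, we provisioned a petabyte" risk is visible here too: nothing in the model stops you from promising far more capacity than the physical pool holds.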

Page 20: Storage Virtualization Introduction

De-Duplication

The next frontier – efficiently storing duplicate content
• More appropriate to some applications than others

Software or appliance (and now array!) analyzes files or blocks, saving duplicates just once
• Block-based approaches reduce capacity more by looking inside files
• Once common only for archives, now available for production data

Serious implications for performance and capacity utilization
• In-line devices process all data before it is written
• Post-processing systems scan written data for duplicates
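The core mechanism, storing each unique block once and referencing it everywhere it appears, can be sketched with a hash index. This is a simplified illustration (real products also handle hash collisions, variable-size chunking, and reference counting for deletes):

```python
import hashlib

# Simplified block-level deduplication: blocks are keyed by their hash,
# so identical blocks across files are stored only once.

class DedupStore:
    def __init__(self):
        self.blocks = {}   # digest -> block data (unique blocks only)
        self.files = {}    # filename -> ordered list of digests

    def put(self, name, blocks):
        digests = []
        for block in blocks:
            d = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(d, block)   # duplicates stored once
            digests.append(d)
        self.files[name] = digests

    def get(self, name):
        return [self.blocks[d] for d in self.files[name]]

store = DedupStore()
store.put("vm1.vmdk", [b"boot", b"os", b"data1"])
store.put("vm2.vmdk", [b"boot", b"os", b"data2"])  # cloned from a template
assert store.get("vm2.vmdk") == [b"boot", b"os", b"data2"]
assert len(store.blocks) == 4   # 6 logical blocks, only 4 stored
```

The in-line vs. post-processing distinction is simply when `put` runs: in the write path before data lands, or as a later scan over already-written data.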

Page 21: Storage Virtualization Introduction

“Cloud” Storage

Many companies are choosing managed services for servers and storage

Lots of managed archive and backup providers
• Zantaz, Google Postini, EMC Mozy, Symantec SPN, etc.

Managed storage services are coming into their own (finally!)
• Amazon S3 and Nirvanix
• EMC “Fortress”

Page 22: Storage Virtualization Introduction

The Next-Generation Data Center

Virtualization of servers and storage will transform the data center
• Clusters of capability host virtual servers
• Cradle-to-grave integrated management

SAN/network convergence is next
• InfiniBand offers converged virtual connectivity today
• iSCSI and FCoE become data center Ethernet (DCE) with converged network adapters (CNAs)

Page 23: Storage Virtualization Introduction

Questions?

Audience Response

Page 24: Storage Virtualization Introduction

Break sponsored by

Page 25: Storage Virtualization Introduction

Part 2: Storage in the Virtual World

Responding to the demands of server, application, and business users with new flexible technologies

Page 26: Storage Virtualization Introduction

Agenda

Why virtual storage for virtual servers?

The real world impact and benefits

Best practices for implementation

Page 27: Storage Virtualization Introduction

Poll: Who Is Using VMware?

Options: VMware, Microsoft, Other, None

Virtualization Users

Page 28: Storage Virtualization Introduction

Poll: Does Server Virtualization Improve Storage Utilization?

Page 29: Storage Virtualization Introduction

Why Use Virtual Storage For Virtual Servers?

1. Mobility of virtual machines between physical servers for load balancing
2. Improved disaster recovery
3. Higher availability
4. Enabling physical server upgrades
5. Operational recovery of virtual machine images

Page 30: Storage Virtualization Introduction

Server Virtualization = SAN and NAS

Server virtualization has transformed the data center and storage requirements
• VMware is the #1 driver of SAN adoption today!
• 60% of virtual server storage is on SAN or NAS (ESG 2008)
• 86% have implemented some server virtualization (ESG 2008)

Server virtualization has enabled and demanded centralization and sharing of storage on arrays like never before!

Page 31: Storage Virtualization Introduction

Three Pillars of VM Performance

Page 32: Storage Virtualization Introduction

Server Virtualization Recoil

Dramatically increased I/O

Patchwork of support, few standards
• “VMware mode” on storage arrays
• Virtual HBA/N_Port ID Virtualization (NPIV)
• Everyone is qualifying everyone and jockeying for position

Can be “detrimental” to storage utilization

Befuddled traditional backup, replication, and reporting

Page 33: Storage Virtualization Introduction

VMware Storage Options: Shared Storage

Shared storage – the common/workstation approach
• Stores VMDK image in VMFS datastores
• DAS or FC/iSCSI SAN
• Hyper-V VHD is similar

Why?
• Traditional, familiar, common (~90%)
• Prime features (Storage VMotion, etc.)
• Multipathing, load balancing, failover*

But…
• Overhead of two storage stacks (5-8%)
• Harder to leverage storage features
• Often shares storage LUN and queue
• Difficult storage management

[Diagram: guest OS on VM host, VMDK in a VMFS datastore on DAS or SAN storage]

Page 34: Storage Virtualization Introduction

VMware Storage Options: Shared Storage on NFS

Shared storage on NFS – skip VMFS and use NAS
• The NFS export is the datastore

Wow!
• Simple – no SAN
• Multiple queues
• Flexible (on-the-fly changes)
• Simple snap and replicate*
• Enables full VMotion
• Use fixed LACP for trunking

But…
• Less familiar (3.0+)
• CPU load questions
• Default limited to 8 NFS datastores
• Will multi-VMDK snaps be consistent?

[Diagram: guest OS on VM host, VMDK on NFS storage]

Page 35: Storage Virtualization Introduction

VMware Storage Options: Raw Device Mapping (RDM)

Raw device mapping (RDM) – guest VMs access storage directly over iSCSI or FC
• VMs can even boot from raw devices
• Hyper-V pass-through LUN is similar

Great!
• Per-server queues for performance
• Easier measurement
• The only method for clustering

But…
• Tricky VMotion and DRS
• No Storage VMotion
• More management overhead
• Limited to 256 LUNs per data center

[Diagram: guest OS on VM host, I/O to SAN storage via a mapping file]

Page 36: Storage Virtualization Introduction

Physical vs. Virtual RDM

Virtual Compatibility Mode
• Appears the same as a VMDK on VMFS
• Retains file locking for clustering
• Allows VM snapshots, clones, VMotion
• Retains the same characteristics if storage is moved

Physical Compatibility Mode
• Appears as a LUN on a “hard” host
• Allows V-to-P clustering, VMware locking
• No VM snapshots, VCB, VMotion
• All characteristics and SCSI commands (except “Report LUN”) are passed through – required for some SAN management software

Page 37: Storage Virtualization Introduction

Physical vs. Virtual RDM

Page 38: Storage Virtualization Introduction

Poll: Which VMware Storage Method Performs Best?

Options: VMFS, RDM (physical), or RDM (virtual)

[Charts: Mixed Random I/O; CPU Cost Per I/O]

Source: “Performance Characterization of VMFS and RDM Using a SAN”, VMware Inc., 2008

Page 39: Storage Virtualization Introduction

Which Storage Protocol is For You?

FC, iSCSI, and NFS all work well
• Most production VM data is on FC
• Either/or? – 50% use a combination (ESG 2008)
• Leverage what you have and are familiar with

For IP storage
• Use TOE cards/iSCSI HBAs
• Use a separate network or VLAN
• Is your switch backplane fast?
• No VM Cluster support with iSCSI*

For FC storage
• 4 Gb FC is awesome for VMs
• Get NPIV (if you can)

Storage protocols for server virtualization (IDC, Dec 2007): Fibre Channel 47%, direct-attached storage (DAS) 24%, network-attached storage (NAS) 22%, iSCSI SAN 7%

Page 40: Storage Virtualization Introduction

Poll: Which Storage Protocol Performs Best?

Options: Fibre Channel, NFS, iSCSI (software), or iSCSI (TOE)

[Charts: Throughput by I/O Size; CPU Cost Per I/O]

Source: “Comparison of Storage Protocol Performance”, VMware Inc., 2008

Page 41: Storage Virtualization Introduction

Storage Configuration Best Practices

Separate operating system and application data
• OS volumes (C: or /) on a different VMFS or LUN from applications (D: etc.)
• Heavy apps get their own VMFS or raw LUN(s)

Optimize storage by application
• Consider different tiers or RAID levels for OS, data, and transaction logs – automated tiering can help
• No more than one VMFS per LUN
• Fewer than 16 production ESX .VMDKs per VMFS

Get thin
• Deduplication can have a huge impact on VMDKs created from a template!
• Thin provisioning can be very useful – thin disk is in Server, not ESX!?!

Page 42: Storage Virtualization Introduction

Why NPIV Matters

N_Port ID Virtualization (NPIV) gives each virtual server a unique WWN
• Easier to move and clone* virtual servers
• Better handling of fabric login
• Virtual servers can have their own LUNs, QoS, and zoning
• Just like a real server!

When looking at NPIV, consider:
• How many virtual WWNs does it support? The T11 spec says “up to 256”
• OS, virtualization software, HBA, FC switch, and array support and licensing
• Can’t upgrade some old hardware for NPIV, especially HBAs

[Diagram: without NPIV, all virtual servers share one WWN (21:00:00:e0:8b:05:05:04); with NPIV, each virtual server gets its own (…05:05:05, …05:05:06, …05:05:07)]
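What NPIV enables can be sketched as follows. This is a hypothetical illustration, not an HBA driver API: one physical port performs multiple fabric logins, so each VM presents its own WWPN and can be zoned and masked like a real server. The WWN values echo the diagram above and are otherwise made up.

```python
import itertools

# Toy NPIV model: a physical HBA port hands out distinct virtual WWPNs,
# one per VM, so the fabric can treat each VM as a separate initiator.

class NPIVPort:
    def __init__(self, physical_wwpn):
        self.physical_wwpn = physical_wwpn
        self._counter = itertools.count(5)   # continue from the diagram's ...:04
        self.virtual = {}                    # vm name -> virtual WWPN

    def assign(self, vm):
        # Each assignment is a separate fabric login with its own identity
        # (the T11 spec allows up to 256 per physical port).
        n = next(self._counter)
        wwpn = f"21:00:00:e0:8b:05:05:{n:02x}"
        self.virtual[vm] = wwpn
        return wwpn

port = NPIVPort("21:00:00:e0:8b:05:05:04")
a = port.assign("vm-a")
b = port.assign("vm-b")
assert a != b                # each VM presents a distinct identity
assert a.endswith(":05")     # ...05:05:05, as in the diagram
```

Zoning and LUN masking then key off the virtual WWPN, which is why a VM's storage view can follow it when it moves to another physical host.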

Page 43: Storage Virtualization Introduction

Virtualization-Enabled Disaster Recovery

DR is a prime beneficiary of server and storage virtualization
• Fewer remote machines idling
• No need for identical equipment
• Quicker recovery (RTO) through preparation and automation

Who’s doing it?
• 26% are replicating server images, and an additional 39% plan to (ESG 2008)
• Half have never used replication before (ESG 2008)

News: VMware Site Recovery Manager (SRM) integrates storage replication with DR

Page 44: Storage Virtualization Introduction

Enhancing Virtual Servers with Storage Virtualization

Mobility of server and storage images enhances load balancing, availability, and maintenance
• SAN and NAS arrays can snap and replicate server images
• VMotion moves the server; Storage VMotion (new in 3.5) moves the storage between shared storage locations

Virtualization-optimized storage
• Pillar and HDS claim to tweak allocation per VM
• Many vendors are announcing compatibility with VMware SRM
• Most new arrays are NPIV-capable

Virtual storage appliances
• LeftHand VSA – a virtual virtualized storage array
• FalconStor CDP – a virtual CDP system

Page 45: Storage Virtualization Introduction

Enabling Virtual Backup

Virtual servers cause havoc for traditional client/server backups
• I/O crunch as schedules kick off – load is consolidated instead of balanced
• Difficult to manage and administer (or even comprehend!)

Storage virtualization can help
• Add disk to handle the load (VTL)
• Switch to alternative mechanisms (snapshots, CDP)

Consider VMware Consolidated Backup (VCB)
• Snapshot-based backup of shared VMware storage
• Block-based backup of all VMDKs on a physical server

Page 46: Storage Virtualization Introduction

Questions?

Audience Response

Page 47: Storage Virtualization Introduction

Break sponsored by

Page 48: Storage Virtualization Introduction

Part 3: Should You Virtualize?

A look at the practical benefits of virtualized storage

Page 49: Storage Virtualization Introduction

Agenda

Pooling for efficiency, flexibility, and scalability

Performance

Stability, availability, and recoverability

The down side

Cost benefit analysis

Where will you virtualize?

Page 50: Storage Virtualization Introduction

Pooling: Efficiency, Flexibility, and Scalability

Effective allocation of resources
• The right amount of storage for the application
• The right type (tiered storage)

Quickly add and remove on demand

Move storage from one device to another
• Tiering, expansion, retirement

Larger systems have fewer capacity limitations

Page 51: Storage Virtualization Introduction

How Green Am I?

Server virtualization can dramatically reduce power, cooling, and space requirements
• Fewer physical servers
• Better (any) power management

Storage virtualization offers fewer green benefits
• Does not normally reduce equipment footprint
• Enterprise storage systems are not very energy efficient

Transformed storage systems might help
• De-duplication, tiered storage, and archiving can slow growth
• New MAID and spin-down devices offer power/cooling savings

Page 52: Storage Virtualization Introduction

Performance

A battle royale between in- and out-of-band!
• In-band virtualization can improve performance with caching
• Out-of-band stays out of the way, relying on caching at the device level
• Split-path adds scalability to in-band

Large arrays perform better (usually) than lots of tiny RAIDs or disks
• First rule of performance: spindles
• Second rule of performance: cache
• Third rule of performance: I/O bottlenecks

Page 53: Storage Virtualization Introduction

Solid State Drives (and Myths)

The new (old) buzz
• RAM vs. NAND flash vs. disk
• EMC added flash drives to the DMX (CX?) as “tier-0”; CEO Joe Tucci claims flash will displace high-end disk after 2010
• Sun, HP adding flash to the server as a cache
• Gear6 caches NAS with RAM

But…
• Are they reliable?
• Do they really perform that well?
• Will you be able to use them?
• Is the 10x-30x cost justified?
• Do they really save power?

Metric | 1 GB Thumb Drive | 64 GB MacBook Air | 146 GB Symmetrix SSD
Capacity (MB) | 977 | 62,500 | 142,578
Max. Write Rate (MB/s) [1] | 14 | 42 | 115
Min. Endurance (writes) [2] | 100,000 | 100,000 | 1,000,000
Capacity x Endurance / Write Rate | 7,000,000 seconds | 149 million seconds | 1.24 billion seconds
Minimum Lifespan at Max. Write Rate | 81 days | 4.7 years | 39 years

Notes: [1] No one writes this fast 24x7. [2] Manufacturers claim 2x to 10x better endurance.
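The slide's lifespan arithmetic can be checked directly: lifespan = capacity × endurance ÷ maximum write rate, which is how long it would take to wear out every cell writing flat-out.

```python
# Reproducing the slide's worst-case endurance arithmetic.

def lifespan_seconds(capacity_mb, endurance_writes, write_rate_mbps):
    # Time to write every cell to its endurance limit at full speed.
    return capacity_mb * endurance_writes / write_rate_mbps

thumb   = lifespan_seconds(977,     100_000,   14)
macbook = lifespan_seconds(62_500,  100_000,   42)
symm    = lifespan_seconds(142_578, 1_000_000, 115)

DAY, YEAR = 86_400, 86_400 * 365
assert round(thumb / DAY) == 81               # ~81 days
assert round(macbook / YEAR, 1) == 4.7        # ~4.7 years
assert round(symm / YEAR) == 39               # ~39 years
```

As the slide's notes say, nobody actually writes at the maximum rate around the clock, so real lifespans are far longer; the calculation is a deliberately pessimistic floor.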

Page 54: Storage Virtualization Introduction

Stability, Availability, and Recoverability

Replication creates copies of storage in other locations
• Local replicas (mirrors and snapshots) are usually frequent and focused on restoring data in daily use
• Remote replicas are used to recover from disasters

Virtualization can ease replication
• Single point of configuration and monitoring
• Can support different hardware at each location

Page 55: Storage Virtualization Introduction

We Love It!

Efficiency, scalability, performance, availability, recoverability, etc…

Without virtualization, none of this can happen!

Page 56: Storage Virtualization Introduction

The Down Side

Consolidation and centralization create bigger baskets for your precious data

Downtime and performance problems affect more systems

Harder to back out if unsatisfied

Additional complexity and interoperability concerns

Scalability issues – ever-bigger systems

Page 57: Storage Virtualization Introduction

Implementation Issues

Many virtualization systems require additional software loaded on servers
• Device drivers, path managers, agents, “shims”

Additional maintenance and configuration can offset “single pane” benefits

Organizational issues can crop up
• Virtualization blurs the lines between who owns what
• The future datacenter combines server, storage, and network roles
• What about applications?

Page 58: Storage Virtualization Introduction

Cost Benefit Analysis

Benefits
• Improved utilization
• Tiering lowers per-GB cost
• Reduced need for proprietary technologies
• Potential reduction of administrative/staffing costs
• Flexibility boosts IT response time
• Performance boosts operational efficiency

Costs
• Additional hardware and software cost
• Added complexity, vendors
• Training and daily management
• Reporting and incomprehensibility
• Possible negative performance impact
• Stability and reliability concerns

Page 59: Storage Virtualization Introduction

Where Will You Virtualize?

Host volume manager
• Pro: full featured; proven; widely available
• Con: impacts CPU, I/O, RAM; can’t benefit across systems
• Best for: smaller shops or anyone seeking flexibility, especially with a recent storage investment

In-band network
• Pro: consolidates assets; replication; caching
• Con: sticky; high cost; performance hit?
• Best for: large environments with heterogeneous storage looking to consolidate and add flexibility

Out-of-band network
• Pro: consolidates assets; low impact; medium cost
• Con: client impact; support matrix
• Best for: large Windows and other supported-technology environments concerned about adding a “choke point”

Storage system
• Pro: performance; familiarity; enhanced tiered storage
• Con: support matrix; high cost; scalability concerns
• Best for: shops standardized on the high-end systems offering this kind of virtualization

Page 60: Storage Virtualization Introduction

Closing Thought: What Is Virtualization Good For?

Virtualization is a technology, not a product

What will you get from using it?
• Better DR?
• Improved service levels and availability?
• Better performance?
• Shortened provisioning time?

The cost must be justified based on business benefit, not cool technology

Page 61: Storage Virtualization Introduction

Audience Response

Questions?

Stephen Foskett

Contoural, Inc.

[email protected]

http://blog.fosketts.net