VPLEX Architecture and Design
Copyright © 2010 EMC Corporation. Do not Copy ‐ All Rights Reserved. 1
© 2010 EMC Corporation. All rights reserved.These materials may not be copied without EMC’s written consent.
Support: Education Services
EMC VPLEX Architecture and Design
April 2010
Welcome to EMC VPLEX Architecture and Design. Click the play button in the lower right hand corner of this screen to continue.
Copyright © 2010 EMC Corporation. All rights reserved.
These materials may not be copied without EMC's written consent.
EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
EMC² , EMC, EMC ControlCenter, AdvantEdge, AlphaStor, ApplicationXtender, Avamar, Captiva, Catalog Solution, Celerra, Centera, CentraStar, ClaimPack, ClaimsEditor, ClaimsEditor, Professional, CLARalert, CLARiiON, ClientPak, CodeLink, Connectrix, Co‐StandbyServer, Dantz, Direct Matrix Architecture, DiskXtender, DiskXtender 2000, Document Sciences, Documentum, EmailXaminer, EmailXtender, EmailXtract, enVision, eRoom, Event Explorer, FLARE, FormWare, HighRoad, InputAccel,InputAccelExpress, Invista, ISIS, Max Retriever, Navisphere, NetWorker, nLayers, OpenScale, PixTools, Powerlink, PowerPath, Rainfinity, RepliStor, ResourcePak, Retrospect, RSA, RSA Secured, RSA Security, SecurID, SecurWorld, Smarts, SnapShotServer, SnapView/IP, SRDF, Symmetrix, TimeFinder, VisualSAN, VSAM‐Assist, WebXtender, where information lives, xPression, xPresso, Xtender, Xtender Solutions; and EMC OnCourse, EMC Proven, EMC Snap, EMC Storage Administrator, Acartus, Access Logix, ArchiveXtender, Authentic Problems, Automated Resource Manager, AutoStart, AutoSwap, AVALONidm, C‐Clip, Celerra Replicator, CLARevent, Codebook Correlation Technology, Common Information Model, CopyCross, CopyPoint, DatabaseXtender, Digital Mailroom, Direct Matrix, EDM, E‐Lab, eInput, Enginuity, FarPoint, FirstPass, Fortress, Global File Virtualization, Graphic Visualization, InfoMover, Infoscape, MediaStor, MirrorView, Mozy, MozyEnterprise, MozyHome, MozyPro, NetWin, OnAlert, PowerSnap, QuickScan, RepliCare, SafeLine, SAN Advisor, SAN Copy, SAN Manager, SDMS, SnapImage, SnapSure, SnapView, StorageScope, SupportMate, SymmAPI, SymmEnabler, Symmetrix DMX, UltraFlex, UltraPoint, UltraScale, Viewlets, VisualSRM are trademarks of EMC Corporation.
All other trademarks used herein are the property of their respective owners.
Course Overview
Audience
This course is intended for those who are presently, or are planning to be, engaged in positioning VPLEX and performing VPLEX solutions design.
Objectives
Upon successful completion of this course, you should be able to:
• Describe VPLEX system architecture and configuration options
• Position solutions utilizing VPLEX, and describe their benefits to the customer
• Describe key VPLEX features, how they can be effectively used, and high‐level tasks for implementing them
• Explain how VPLEX can be integrated into your customer's production environment
• Perform planning and design for VPLEX deployment
EMC believes the information in this course is accurate as of its publication date. It is based on pre‐GA product information, which is subject to change without notice. For the most current information, see the EMC Support Matrix and product release notes in Powerlink.
Description
This course provides detailed coverage of VPLEX in typical data center environments. It comprehensively addresses product architecture, host‐to‐virtual‐storage implementation, system environment sizing, and management and monitoring of VPLEX environments.
This course provides an introduction to EMC VPLEX. It describes VPLEX system architecture, key features, and recommended implementations.
This training provides familiarity with the major VPLEX solutions design concerns. It also includes a high‐level view of implementation tasks related to specific VPLEX features, functionality, and management.
Course Modules
Module 1: VPLEX Technology and Positioning
Module 2: Architecture ‐ Physical and Logical Components
Module 3: VPLEX Functionality and Management
Module 4: Planning and Design Considerations
This eLearning course is structured into four modules:
Module 1 briefly covers EMC’s vision on block storage virtualization, and how VPLEX is being positioned.
Module 2 discusses the underlying technology and architecture.
Module 3 covers the major features and capabilities available in the current release.
Module 4 addresses the significant planning and design considerations relevant to VPLEX deployment.
Module 1: VPLEX Technology and Positioning
Upon successful completion of this module, you should be able to:
• Articulate how VPLEX can enable EMC's vision of the journey to the private cloud
• Describe VPLEX local and distributed federation
• Provide a high‐level system view of VPLEX Local and VPLEX Metro
• Describe typical scenarios where VPLEX technology can be effectively applied
This module introduces fundamental concepts relevant to VPLEX technology, local federation and distributed federation.
The introductory module briefly outlines EMC’s vision on block storage virtualization, and positions VPLEX enabled solutions within the broader context of that vision.
Journey to the Private Cloud

Transitioning to Private Cloud: Information Infrastructure
• Reduce CapEx & OpEx: leverage efficiency technologies
• Optimize service levels: tier and consolidate
• Deliver "always on": 24 x forever availability
• Manage at scale: simplify and automate

When EMC speaks of the Private Cloud, it is describing a strategy for your infrastructure that enables optimized resource use. This means you are optimized for energy, power and cost savings. You can scale up and out simply and apply automated policies, and you can guarantee greater availability and access for your production environment, significantly reducing or eliminating downtime.
EMC Vision: Virtual Storage

Efficient | Secure | Always‐on | Automated | On‐Demand | Integrated
Physical storage + FAST + Federation + Storage Virtualization
Capabilities that free information from physical storage:
• 24 x forever: run applications without restart. Ever!
• Move thousands of VMs over thousands of miles
• Batch process in low‐cost energy locations
• Dynamic workload balancing and relocation
• Aggregate big data centers from separate ones
For years, users have relied on "physical storage" to meet their information needs. New and evolving changes, such as virtualization and the adoption of Private Cloud computing, have placed new demands on how storage and information are managed.
To meet these new requirements, storage must evolve to deliver capabilities that free information from a physical element into a virtualized resource that is fully automated, integrated within the infrastructure, consumed on demand, cost effective and efficient, always on, and secure. The technology enablers needed to deliver this combine unique EMC capabilities such as FAST, Federation, and storage virtualization.
The result is a next generation Private Cloud infrastructure that allows users to:
• Move thousands of VMs over thousands of miles
• Batch process in low‐cost energy locations
• Enable boundary‐less workload balancing and relocation
• Aggregate big data centers
• Deliver "24 x forever", and run or recover applications without ever having to restart
EMC VPLEX Architecture

Local & Distributed Federation: next generation data mobility and access (available April 2010)
• Scale‐out cluster architecture: start small and grow big with predictable service levels
• Advanced data caching: improve I/O performance and reduce storage array contention
• Distributed cache coherence: automatic sharing, balancing and failover of storage domains within and across VPLEX Engines
• Access Anywhere across EMC and non‐EMC arrays
EMC VPLEX is a next generation architecture for data mobility and information access.
It is based on unique technology that combines scale out clustering and advanced data caching, with the unique distributed cache coherence intelligence to deliver radically new and improved approaches to storage management.
This architecture allows data to be accessed and shared between locations over distance via a distributed federation of storage resources.
The first products being introduced based on this architecture include configurations that support local and metro environments, with additional products planned for future releases.
EMC VPLEX Capabilities

Storage Virtualization | Local Federation | Distributed Federation
Access Anywhere: EMC and non‐EMC arrays
• Streamline storage refreshes, consolidations and migrations
• Simplify multi‐array allocation, management, and provisioning
• Pool storage capacity to extend useful life for N‐1 storage assets
These capabilities operate within, across, and between data centers over distance, enable information to be "accessed anywhere", and provide "just in time" storage services via scale‐out.
Distributed federation builds on traditional virtualization by adding the ability to transparently move and migrate data within and across data centers. This simplifies multi‐array storage management and multi‐site information access, as well as allows capacity to be pooled and efficiently scaled on demand.
VPLEX Local: Overview

VPLEX Local (Single Cluster)
• Simplify provisioning and volume management: centralize management of block storage in the data center; simplify storage provisioning, management and monitoring; physical storage needs to be provisioned just once, to the virtualization layer
• Non‐disruptive data mobility: optimize performance, redistribute and balance workloads among arrays
• Workload resiliency: improve reliability, scale out performance
• Storage pooling: manage available capacity across multiple frames based on SLAs
Around 2003, storage virtualization was introduced as a viable solution. The primary value proposition of storage virtualization was moving data non‐disruptively. Customers looked to this technology for transparent tiering, moving back‐end storage data without having to disrupt hosts, simplified operations over multiple frames, as well as ongoing data moves for tech refreshes and lease rollovers.
Customers required tools that enabled storage moves to be made without forcing interaction at the host and database administration levels. The concept of a virtualization controller was introduced and took its place in the market. While EMC released its own version of this with the Invista split‐path architecture, we also continued development on both Symmetrix and CLARiiON to integrate multiple tiers of storage within a single array. Today, we offer Flash, Fibre Channel and SATA within EMC arrays, and a very transparent method of moving data across different storage types and tiers with our virtual LUN capability. We found that providing both choices allowed our products to meet a wider set of customer challenges than offering only one of the two options.
The challenges addressed by traditional storage virtualization – which can be broadly categorized as simplified storage management ‐ still exist today. VPLEX local federation can solve this class of problems within the context of a single data center.
However, we’ve also seen these data center issues evolve. Newer, different problems have emerged that require new solutions – as we’ll see next, when we discuss distributed federation.
VPLEX Metro: Overview

VPLEX Metro (Two Clusters): Cluster‐1/Site A and Cluster‐2/Site B
• AccessAnywhere: block storage access within, between and across data centers
• Within synchronous distances: approximately 60 miles or 100 kilometers
• Connects two VPLEX storage clusters together over distance
• Enables virtual volumes to be shared by both clusters, with unique distributed cache coherency for all reads and writes
• Both clusters maintain the same identity for a volume, and preserve the same SCSI state for the logical unit
• Enables VMware VMotion over distance
With VPLEX distributed federation, it becomes possible to configure shared volumes to hosts that are in different sites or failure domains. This enables a new set of solutions that can be implemented over synchronous distances, where earlier these solutions could reside only within a single data center. VMware VMotion over distance is a prime example of such solutions.
Another key technology that enables AccessAnywhere is remote access. This makes it possible for block storage to be accessed as though it were local, even though it is remote.
Example: Current Workload Relocation within Sites

[Diagram: two sites, Domain 1/Site 1 and Domain 2/Site 2, separated by synchronous distance (100 km). Each site has its own SAN and VMFS volumes on Symmetrix, CLARiiON and third‐party arrays, hosting virtualized Microsoft applications: MS Exchange mail servers (Mail_1 through Mail_4), SQL Server 2008, SharePoint 2007 web front ends and Excel services, and a file and print server on Windows 2008 Server. VMotion moves VMs only within each site.]

Challenges:
• Uneven resource utilization across sites
• Planned events requiring shutdown
This typical scenario deals with a dual‐site environment with virtualized Microsoft application servers at each site. VMotion can currently leverage shared SAN storage to move VMs across ESX servers within each site.
However, the customer is now looking to expand the scope of VMotion beyond site boundaries to further improve resource utilization, and to handle planned events that may affect an entire site.
Proposed: VMotion over Distance with VPLEX

[Diagram: the same two sites, now connected by an FC MAN over synchronous distance (100 km). The VMFS volume resides on a VPLEX distributed device shared between the sites, enabling Distance VMotion of the application VMs from one site to the other.]

Addressing the challenges:
• Distance VMotion: load‐balance across sites
• Planned site‐wide events: move applications proactively to the other site

Other potential benefits:
• Disaster avoidance
• Improved infrastructure availability and performance
• Power/energy savings by moving VMs across sites
The proposed solution accomplishes this as follows. It involves a VPLEX Metro spanning the sites, with the application VMs using shared data stores built on VPLEX distributed devices. This enables non‐disruptive distance VMotion across sites, thereby addressing the customer's primary challenges. Distance VMotion also opens up other possibilities for this customer, as listed here.
VPLEX Local: Single Cluster

• 1 to 4 Virtualization Engines per rack
• Up to 8,000 total virtual devices per cluster
• N+1 performance scaling
• Cache write‐through to preserve array functionality

Supported user environments at General Availability:
Host Platforms: ESX, Windows, Solaris, AIX, HP‐UX, Linux
Multi‐Pathing: PowerPath, VMware NMP
Volume Managers: VxVM, AIX LVM, HP LVM
Arrays (at GA): VMAX, DMX, CLARiiON, HDS 99X0, USP‐V, USP‐VM
SAN Fabrics: Brocade, McData and Cisco

[Rack diagram: engines with redundant power supplies, two 8‐port FC switches, switch UPSs, and a Management Server.]
Shown is a summary of the key characteristics of a VPLEX Local or single cluster configuration.
Among our key value propositions: you can start small and scale up, you can have centralized management, as well as predictable performance and availability.
The engines are arranged in a true cluster, which means I/O that enters the cluster from anywhere can be serviced from anywhere.
The engines are arranged in an N+1 configuration, which means that as you add more engines, you increase the memory, ports and performance of the total cluster. The cluster can withstand the failure of any device and any component, and will continue to operate and provide storage services as long as just one device survives. You get transparent mobility across heterogeneous arrays. If you need to extend these capabilities over distance, or across multiple failure domains within a single site, a VPLEX Metro configuration may be a more appropriate choice.
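The scale-out effect of adding engines can be illustrated with a short sketch. This is an illustration only, not an official EMC sizing tool; it assumes resources scale linearly with engine count and uses per-engine figures quoted elsewhere in this course (32 FC ports and 64 GB raw cache per engine, with 1, 2 or 4 engines per cluster):

```python
# Rough N+1 scale-out illustration (assumption: linear scaling).
# Per-engine figures are taken from this course's engine description:
# 32 Fibre Channel ports and 64 GB of raw cache per engine.

PORTS_PER_ENGINE = 32
CACHE_GB_PER_ENGINE = 64

def cluster_resources(engines: int) -> dict:
    """Aggregate FC ports and raw cache for a cluster of N engines."""
    if engines not in (1, 2, 4):  # supported engine counts per cluster
        raise ValueError("a VPLEX cluster has 1, 2 or 4 engines")
    return {
        "fc_ports": engines * PORTS_PER_ENGINE,
        "cache_gb": engines * CACHE_GB_PER_ENGINE,
    }

print(cluster_resources(4))  # {'fc_ports': 128, 'cache_gb': 256}
```

Adding engines grows ports and cache, while the device limit (8,000 virtual devices per cluster) stays fixed regardless of engine count.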
VPLEX Metro: Dual Cluster

[Diagram: a Metro‐Plex of two VPLEX racks, each with engines, redundant power supplies, two 8‐port FC switches, switch UPSs, and a Management Server.]

• Up to 8 Virtualization Engines
• 16K (8K per cluster, or shared) total virtual devices
• Within or across data centers
• Synchronous distance support
Here is a brief synopsis of VPLEX Metro configurations, limits and key capabilities.
As we saw with VPLEX Local, each single cluster can support 8,000 back‐end storage volumes and 8,000 virtual volumes, regardless of whether you specify 1, 2 or 4 engines. The number of engines influences the total number of FE/BE ports available, and thus the scalability and obtainable performance relative to the number of host and storage array ports to be serviced. A VPLEX Metro dual cluster can support a total of 16,000 front‐end and 16,000 back‐end devices. However, when creating distributed RAID 1 devices, remember that you are consuming two devices, one from each cluster in the Metro; so if all devices are DR1s, the limit is 8,000 front‐end devices.
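The DR1 device accounting described above can be sketched in a few lines. This is a hypothetical illustration of the arithmetic, not an EMC tool; the 8,000-device-per-cluster figure comes from this course:

```python
# Illustrative sketch of the distributed-device accounting above.
# Assumption: each distributed RAID-1 (DR1) device consumes one device
# from each cluster's 8,000-device budget.

PER_CLUSTER_LIMIT = 8000

def remaining_local_devices(dr1_count: int) -> int:
    """Devices left on each cluster for non-distributed use after
    provisioning dr1_count distributed RAID-1 devices."""
    if not 0 <= dr1_count <= PER_CLUSTER_LIMIT:
        raise ValueError("DR1 count exceeds the per-cluster device limit")
    return PER_CLUSTER_LIMIT - dr1_count

print(remaining_local_devices(8000))  # all-DR1 case: 0 devices left per cluster
```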
One view of a Metro‐Plex is each cluster servicing a different physical site, with up to 100 km between sites.
An equally useful alternate view is two joined clusters at a single site with shared LUNs between them. You may choose to implement these two clusters as two different targets within separate failure domains, for example, in the same data center.
At GA, VPLEX will support clustered host file systems, including VMFS. With this deployment, multiple ESX servers can read/write the same file system simultaneously, while individual virtual machine files are locked. We will also extend support over time to include Sun Cluster, HP cluster, IBM cluster, and CXFS.
Currently there is a limitation for stretched host clusters over distance: if one site fails, you must manually restart that site's applications.
Module 2: Architecture ‐ Physical and Logical Components
Upon successful completion of this module, you should be able to:
• Provide a comprehensive view of VPLEX Local and VPLEX Metro
• Describe VPLEX hardware and software architecture at a high level
This module describes physical and logical components comprising a VPLEX system, the currently‐available federation features, and their internal operation.
VPLEX Architecture

[Diagram: two sites, Cluster‐1/Site A and Cluster‐2/Site B, connected by an FC MAN and an IP link. At each site, hosts attach to the VPLEX front‐end ports of a VPLEX Engine; the VPLEX Directors within the engine communicate over LCOM; the VPLEX back‐end ports attach to EMC and non‐EMC arrays; virtual volumes are presented to the hosts; and a VPLEX Management Server manages each cluster.]
Let's look at a typical production SAN environment, and how VPLEX fits and works within it.
The basic building block of a VPLEX system is the Engine. Multiple engines can be configured to form a single VPLEX cluster for scalability. Each Engine includes two High‐Availability Directors with front‐end and back‐end Fibre Channel ports for integration with the customer's fabrics. VPLEX does not rely on (or require) any particular fabric intelligence. The Director FE and BE ports show up as standard F‐ports on the fabrics. VPLEX technology can work equally well with Brocade or Cisco fabrics with no dependency on switching hardware or firmware. Directors within a cluster communicate with each other via redundant, private Fibre Channel links called LCOM links.
Each cluster includes a 1‐U Management Server with a public IP port for system management and administration over the customer’s network. The Management Server also has private, redundant IP network connections to each Director within the cluster.
VPLEX implementation fundamentally involves three tasks: presenting SAN volumes from back‐end arrays to VPLEX engines via each Director’s back‐end ports; packaging these into sets of VPLEX Virtual Volumes with the desired configurations and protection levels; and presenting Virtual Volumes to production hosts in the SAN via the VPLEX front‐end.
Currently a VPLEX system can support a maximum of two clusters. A dual‐cluster system is called a Metro‐Plex. For a dual‐cluster implementation, the two sites must be less than 100 km apart, with round‐trip latency of 5 msecs or less on the FC links. VPLEX clusters within a Metro‐Plex communicate via FC over the Directors’ FC‐MAN ports.
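As a back-of-the-envelope check (my own calculation, not part of the course material; it assumes a typical single-mode fiber refractive index of about 1.47), raw light propagation accounts for only about 1 ms of the 5 ms round-trip budget at 100 km; the remainder absorbs switching, protocol, and equipment latency:

```python
# Propagation-delay sanity check for the 100 km / 5 ms RTT requirement.
# Assumption: light travels through fiber at roughly c / 1.47.

SPEED_OF_LIGHT_KM_S = 299_792          # vacuum, km/s
FIBER_REFRACTIVE_INDEX = 1.47          # typical single-mode fiber (assumption)

def fiber_rtt_ms(distance_km: float) -> float:
    """Round-trip propagation delay over fiber, in milliseconds."""
    speed_in_fiber = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    one_way_seconds = distance_km / speed_in_fiber
    return 2 * one_way_seconds * 1000

print(f"100 km fiber RTT: {fiber_rtt_ms(100):.2f} ms")  # ~0.98 ms
```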
VPLEX implements a VPN tunnel between the Management Servers of the two clusters. This enables each Management Server to communicate with Directors in either cluster via the private IP networks. With this design, it’s possible to conveniently manage a Metro‐Plex from either of the two sites.
VPLEX Engine: Characteristics

[Diagram: each of the engine's two directors contains a multi‐core CPU complex, global memory, and 8 Gb/s Fibre Channel host and array ports.]

• Dual HA Directors per engine
• GeoSynchrony software runs on each Director to provide VPLEX features and functionality
• 32 8 Gb/s Fibre Channel FE/BE ports for fabric connectivity to hosts and storage arrays
• Fibre Channel interconnect between Directors
• Intel multi‐core CPUs
• 64 GB (raw) of cache memory
• Redundant power supplies
• Integrated battery backup
• Built‐in "Call Home" support
The engine itself is designed with a highly available hardware architecture. It hosts two Directors with a total of 32 Fibre Channel ports: 16 FE and 16 BE. All major engine components are redundant.
The engine is built for performance with a large cache, and has fully redundant power supplies, battery backups and EMC Call Home capabilities to align with our support best practices.
Distributed Cache Coherency

[Diagram: one host issues a new write of block 3 to an engine while another host reads block 3 through a different engine. Each director holds a cache with its own cache directory (A through H), and each engine maintains a cache coherency directory mapping block addresses (1, 2, 3, …) to the owning cache (A, C, E or G).]
The VPLEX environment is dynamic and uses a hierarchy to keep track of where I/Os go.
An I/O request can come from anywhere and will be serviced by any available engine in the VPLEX cluster. VPLEX abstracts the ownership model into a high‐level directory that is updated for every I/O and shared across all engines. The directory uses a small amount of metadata to tell all other engines in the cluster which 4 KB block of data is owned by which engine, and at what time. The communication that actually occurs is much smaller than the 4 KB blocks being updated.
If a read request comes in, VPLEX automatically checks the directory for an owner. Once the owner is located, the read request goes directly to that engine.
Once a write is done and the table is modified, if another read request comes in from another engine, it checks the table and can then pull the read directly from that engine's cache. If it's still in cache, there is no need to go to the disk to satisfy the read. This model also enables VPLEX to stretch the cluster, as we can distribute this directory between clusters and therefore, between sites. The design has minimal overhead, is very efficient, and enables effective communication over distance.
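The read/write flow just described can be sketched as a toy model. This is a hypothetical illustration of directory-based ownership, not VPLEX's actual implementation: a write makes the writing engine the owner of a block, and a later read from any engine consults the shared directory and is serviced from the owner's cache without touching disk.

```python
# Toy model of a shared cache coherency directory (illustration only).

class CoherencyDirectory:
    def __init__(self):
        self.owner = {}    # block address -> owning engine id
        self.caches = {}   # engine id -> {block address: data}

    def write(self, engine, block, data):
        # A write records new ownership and caches the data on
        # the writing engine.
        self.owner[block] = engine
        self.caches.setdefault(engine, {})[block] = data

    def read(self, engine, block):
        # Any engine can service the read by looking up the owner;
        # if the block is still in the owner's cache, no disk access
        # is needed.
        owning_engine = self.owner.get(block)
        if owning_engine is None:
            return None  # no owner recorded: go to back-end storage
        return self.caches[owning_engine].get(block)

directory = CoherencyDirectory()
directory.write("engine-1", 3, b"new data")  # new write: block 3
print(directory.read("engine-2", 3))         # another engine reads block 3
```

The same directory can be distributed between clusters, which is what lets the model stretch across sites.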
Hardware Components: Engine

Directors:
• Front‐end ports provide active/active access to virtual volumes
• Process Fibre Channel SCSI commands from hosts

[Photos: VPLEX engine front and back, showing Director A and Director B.]
The two directors within a VPLEX engine are designated "A" and "B"; Director A is below Director B. Each director contains dual Intel quad‐core CPUs running at 2.4 GHz, 32 GB of read cache memory, and a total of 16 8‐Gbps FC ports: 8 front‐end and 8 back‐end. Both directors are active during cluster operations.
Hardware Components: I/O Modules

[Diagram: the engine's module slots, labeled Front‐End, Back‐End, COM and GigE.]
There are a total of 12 I/O modules in a VPLEX engine. 10 of these modules are Fibre Channel and 2 are GigE. The Fibre Channel ports can negotiate up to 8 Gbps. Four FC modules are dedicated for front‐end use and four for the back‐end. The two remaining FC modules are used for inter/intra cluster communication. The two GigE I/O modules are not utilized in this release of VPLEX.
Hardware Components: DAE

[Photos: the internal DAE behind its screen, the DAE with the screen removed, and an SSD drive carrier.]
VPLEX internal SSDs can be accessed from the front of a VPLEX system. Each director is assigned one SSD, and boots from it. SSDs reside within an SSD Drive Carrier behind the DAE screen. Each SSD Drive Carrier can hold two 2.5 inch SSDs. However, only one SSD is installed per drive carrier. Each SSD has a drive capacity of 30 GB.
Hardware Components: I/O Module Carrier

[Photo: an I/O module carrier.]
A VPLEX engine contains two I/O Module carriers, one for Director A and one for Director B. The one on the right is for Director A and the one on the left for Director B. There are two I/O modules per carrier. The one that is shown in this picture contains a Fibre Channel module and a GigE module. As we just discussed, the Fibre Channel module is used for inter‐ and intra‐cluster communication within a VPLEX system.
Hardware Components: I/O Module Types

FC IOM:
• 4‐port 8 Gbps Fibre Channel IOM
• Used for FC COM and FC WAN connectivity within an I/O Module carrier

[Photo: the FC IOM in an I/O module carrier, with ports numbered 0 through 3.]
This is the FC I/O Module from an I/O Module Carrier which is used for inter‐ and intra‐ cluster communication. In this module, Ports 0 and 1 are used for local COM. Ports 2 and 3 are used for WAN COM between clusters in a Metro‐Plex. In medium and large configurations, FC I/O COM ports run at 4 Gbps. In terms of physical hardware, this FC I/O module is identical to the I/O modules used for front‐end and back‐end connectivity in the director slots.
Hardware Components: Management and Power
• Allows for daisy chain connection between engines within a cluster
• USB port unused
Power Supplies
Management Modules
Each engine contains two management modules and two power supplies. Each management module contains two serial ports and two Ethernet ports. The upper of the two serial ports is open, and can be utilized by EMC field personnel for BIOS and POST access. The lower serial port ships pre‐cabled. It is used to monitor the SPS and UPS. The Ethernet ports are used to connect to the Management Server and also to other Directors within the cluster, in a daisy‐chain fashion.
Hardware Components: VPLEX Management Server
Central Point of Management
The VPLEX Management Server is the central point of management for a VPLEX Local and VPLEX Metro system. It ships with a dual‐core Xeon processor, a 250 GB SATA near‐line drive and 4 GB of memory. The Management Server interfaces between the customer network and the VPLEX cluster. It isolates the VPLEX internal management networks from the customer LAN. It communicates with VPLEX firmware layers within the directors over the private IP connections. A Management server ships with each VPLEX cluster.
Note that the loss of a Management Server does not impact host I/O to VPLEX‐provided virtual storage. Within a Metro‐Plex there are two Management Servers, one for each cluster. Both clusters can be controlled from either Management Server. A Metro‐Plex utilizes a secure management connection between the two Management Servers via a VPN connection. A VPLEX cluster can be controlled through the Management Console, which runs on the Management Server.
The Management Server also enables remote support via an ESRS Gateway. With this functionality in place, VPLEX is able to send Call Home events and system reports to the ESRS Gateway.
Hardware Components: Fibre Channel COM Switches
Connectrix DS‐300B: creates a redundant Fibre Channel network for COM
Connectrix DS‐300B switches are used for intra‐cluster communication in a VPLEX medium or large configuration. A pair of DS‐300B switches ship pre‐cabled, with medium or large configurations. These switches create redundant Fibre Channel networks for the internal LCOM connections. Each director has two independent LCOM paths to every other director. A VPLEX medium configuration uses 4 ports per switch and a VPLEX large configuration uses 8 ports per switch. 16 ports remain disabled, unused and unlicensed. Each port runs at 4 Gbps. The LCOM networks are completely private ‐ no customer connections are permitted on these switches. Each Connectrix DS‐300B utilizes an independent UPS.
VPLEX Local: Supported Configurations
[Diagram: rack layouts. Single Engine: Engine 1 with redundant SPSs and a Management Server. Dual Engine: Engines 1-2 with redundant SPSs, Management Server, FC Switches A/B, and UPS A/B. Quad Engine: Engines 1-4 with redundant SPSs, Management Server, FC Switches A/B, and UPS A/B]
All supported VPLEX configurations ship in a standard, single rack.
The shipped rack contains the selected number of engines, one Management Server, redundant Standby Power Supplies (SPSs) for each Engine and any other needed internal components. For the dual and quad configurations only, these include redundant internal FC switches for LCOM connection between the Directors. In addition, dual and quad configurations contain redundant Uninterruptible Power Supplies (UPSs) that service the FC switches and the Management Server.
The software is pre‐installed, the system is pre‐cabled, and also pre‐tested.
Engines are numbered 1‐4 from the bottom to the top. Any spare space in the shipped rack is to be preserved for potential engine upgrades in the future. The customer may not repurpose this space for unrelated uses. Since the engine number dictates its physical position in the rack, numbering will remain intact as engines get added during a cluster upgrade.
Configurations at a Glance
                                     Single Engine   Dual Engine   Quad Engine
Directors                            2               4             8
Redundant Engine SPSs                Yes             Yes           Yes
FE Fibre Channel ports               16              32            64
BE Fibre Channel ports               16              32            64
Management Servers                   1               1             1
Internal FC switches (for LCOM)      None            2             2
Uninterruptible Power Supplies (UPS) None            2             2
Cache                                64 GB           128 GB        256 GB
Start Small and Transparently Scale Out Engines
This table provides a quick comparison of the three different VPLEX single cluster configurations available at GA.
VPLEX Management: IP Infrastructure
[Diagram: a management client on the customer LAN connects via HTTPS or SSH to the Management Server, which reaches the Directors in the EMC VPLEX cluster over redundant internal IP networks]
Shown is a high‐level architectural view of single cluster management. The Management Server is the only VPLEX component that gets configured with a “public” IP on the customer network.
From the customer network, the Management Server can be accessed by a VPLEX storage administrator via an SSH session. Within the SSH session, the administrator can run a CLI utility, called VPlexcli, to manage all aspects of the cluster. A browser‐based GUI is also available.
VPLEX Management
VPLEX Management Console (GUI)
VPlexcli (CLI)
VPLEX provides two management interfaces: the VPlexcli and the VPLEX Management Console. The VPlexcli can be accessed via a telnet session to TCP port 49500 on the Management Server. The VPLEX Management Console is accessed by pointing a browser at the Management Server IP using the https protocol. Currently the VPLEX CLI is the more mature interface, providing complete support for all documented features and functionality. The Management Console has known limitations in some areas; for example, mobility operations can only be performed using the CLI.
Every time the VPlexcli is accessed, it creates a session log in the /var/log/VPlex/cli/ directory. Logging in through the Management Console also creates a session file in /var/log/VPlex/cli.

VPLEX Management Console:
• Accessed via an https session to the Management Server
• Intuitive, easy‐to‐use interface for simplified storage management
• Incorporates comprehensive online help
VPLEX Federation: Constructs
[Diagram: Storage Volumes are carved into Extents, which back Devices]
Let’s examine the various types of managed storage objects within EMC VPLEX, their inter‐relationships, and how they relate to entities external to VPLEX – such as customer hosts and customer storage arrays.
Back‐end storage arrays are configured to present LUNs to VPLEX back‐end ports.
Each presented back‐end LUN maps to one VPLEX Storage Volume. Storage Volumes are initially in the “unclaimed” state. Unclaimed storage volumes may not be used for any purpose within VPLEX other than to create meta‐volumes, which are for system internal use only.
Once a Storage Volume has been claimed within VPLEX, it may be carved into one or more contiguous Extents. A single Extent may map to an entire Storage Volume; however, it cannot span multiple Storage Volumes.
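The claim-then-carve rules above can be sketched as a small model. This is an illustrative sketch only, not VPLEX code; the class and names are hypothetical. It enforces that extents are created only on claimed storage volumes and never span past the end of a volume:

```python
# Hypothetical model of VPLEX extent-carving rules (illustration only).

class StorageVolume:
    def __init__(self, name, blocks):
        self.name = name
        self.blocks = blocks       # total capacity in blocks
        self.claimed = False
        self.extents = []          # list of (start, length) tuples

    def claim(self):
        self.claimed = True

    def create_extent(self, start, length):
        # Extents are only allowed on claimed storage volumes,
        # and a single extent may not span beyond its storage volume.
        if not self.claimed:
            raise ValueError("storage volume must be claimed first")
        if start + length > self.blocks:
            raise ValueError("extent cannot span beyond its storage volume")
        self.extents.append((start, length))
        return (self.name, start, length)

sv = StorageVolume("storage_vol_1", blocks=1000)
sv.claim()
whole = sv.create_extent(0, 1000)   # one extent mapping the entire volume
```

A single extent may cover the whole volume, as in `whole` above, but an attempt to carve past the end of the volume, or to carve an unclaimed volume, is rejected.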
A VPLEX Device is the entity that enables RAID implementation across multiple storage arrays. VPLEX supports RAID‐0 for striping, RAID‐1 for mirroring, and RAID‐C for concatenation. The simplest possible device is a single RAID‐0 device comprising one extent, as shown here.
Shown next is a more complex device – for example a striped RAID‐0 device across two extents. Note that the underlying extents could even be from multiple backend storage arrays.
VPLEX Federation: Constructs

[Diagram: a Storage View containing registered host initiators, VPLEX front‐end ports, and a Virtual Volume mapped to a Top Level Device (TLD), which is layered on devices, extents, and storage volumes]
Devices may be layered on top of other devices. For example, we could create a RAID‐1 mirrored device with two dissimilar mirror legs, as shown in this example. Only devices at the top‐level may have a front‐end SCSI personality and be presented to hosts. These are called Top Level Devices.
“Storage View” is the masking construct that controls how virtual storage is exposed through the front‐end. An operational Storage View is configured with three sets of entities as shown next.
First, any hosts that the Storage View must present storage to should have one or more initiator ports (HBAs) in the Storage View. Host initiators should be registered with one of several specifically recognized and supported host personality types within VPLEX, such as “default” which corresponds to most open systems hosts: Windows and Linux, HP‐UX, and VCS. A high‐availability host should have a minimum of two registered initiator ports each within its Storage View.
Second, one or more VPLEX front‐end ports needs to be configured as part of the Storage View. A typical high‐availability configuration would use a minimum of one front‐end port per fabric, each of them servicing a separate host initiator.
Third, a Virtual Volume that maps to the appropriate Top Level Device needs to be created and then configured as part of the Storage View.
Once a Storage View is properly configured as described and operational, the host should be able to detect and use Virtual Volumes after initiating a bus‐scan on its HBAs. Every front‐end path to a Virtual Volume is an active path, and the current version of VPLEX presents volumes with the product ID “Invista”. The host requires supported multi‐pathing software in a typical high‐availability implementation.
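The three required ingredients of an operational Storage View can be captured in a short sketch. This is a hypothetical model, not VPLEX code; the class, WWPNs, and port names are illustrative only:

```python
# Hypothetical sketch of the Storage View masking construct: a view is
# operational only once it holds host initiators, VPLEX front-end ports,
# and at least one virtual volume.

class StorageView:
    def __init__(self, name):
        self.name = name
        self.initiators = set()    # registered host HBA ports (WWPNs)
        self.fe_ports = set()      # VPLEX front-end ports
        self.volumes = set()       # virtual volumes mapped to top-level devices

    def is_operational(self):
        # All three sets must be non-empty before hosts can see storage.
        return bool(self.initiators) and bool(self.fe_ports) and bool(self.volumes)

view = StorageView("host1_view")
view.initiators.update({"0x10000000c987422a", "0x10000000c987422b"})  # two HBAs for HA
view.fe_ports.update({"fe_port_fabric_a", "fe_port_fabric_b"})        # one FE port per fabric
view.volumes.add("virtual_vol_0001")
```

With any of the three sets empty, the view would not present storage; the high‐availability recommendation in the narration maps to two initiators and two front‐end ports, one per fabric.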
Module 3: VPLEX Functionality and Management
Upon successful completion of this module, you should be able to:
• Describe local federation capabilities within a VPLEX cluster
• Describe distributed federation capabilities in a Metro‐Plex
• Explain the VPLEX internal data flow operations for host‐to‐storage I/O under various scenarios
• Describe key VPLEX administration and maintenance features
This module provides a detailed look at the core VPLEX product functionality available at GA.
Provisioning: Using the VPLEX Management Console
Tasks
Provisioning Overview
Provision Storage
This is the home section of the EMC VPLEX Management Console. This is a good logical starting point for many VPLEX management operations.
On the right of the screen are the storage provisioning steps. These steps are also links that take the user to the page where each step is implemented.
On the left of the screen is a picture showing the task sequence for provisioning virtual volumes out of VPLEX. To the right of the Home button there are two more links, "Provision Storage" and "Help". The Provision Storage link takes the user to an alternative page from which provisioning can be implemented. The Help link takes the user to the VPLEX Online Help page.
Brown‐field Implementation: Encapsulation
Encapsulation: the process of converting existing production SAN volumes on hosts to VPLEX volumes, via "one‐for‐one" mapping

• EMC VPLEX maintains physical separation of metadata from host data
  - VPLEX metadata is stored separately on metadata volumes
  - Basis for simple data‐in‐place mobility
• High‐level steps:
  1. Present native array LUN with existing data to VPLEX back‐end
  2. Claim the LUN as a storage volume from VPLEX
  3. Create one extent consisting of the entire storage volume
  4. Create a RAID‐0 device on the extent
  5. Create a Virtual Volume on the device
  6. Un‐provision native LUN from host
  7. Present VPLEX Virtual Volume to host
• One‐time disruption to host
Encapsulation is essentially a "data‐in‐place" migration of existing production data into VPLEX, and therefore does not require any additional storage. Encapsulation is disruptive, since you cannot simultaneously present storage both through VPLEX and directly from the storage array without risking data corruption, due to read‐caching at the VPLEX level.
You have to cut‐over from direct array access to VPLEX virtualized access. This implies a period where all paths to storage are unavailable to the application. With proper planning and execution, this downtime can be minimized. When PowerPath Migration Enabler (PPME) support is put in place, it can help eliminate any disruption.
An alternative migration strategy for existing production hosts is to perform host based replication from native‐array volumes to net‐new VPLEX volumes. This is non‐disruptive but requires additional storage. Host‐based copy also consumes cycles on the host, and may need to be planned in a live production environment.
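The safety constraint behind the disruptive cutover can be expressed as a tiny ordering check. This is an illustrative sketch under the assumption that the steps are exactly those listed on the slide; it is not a VPLEX tool. The invariant: the host must never hold the native LUN path and the encapsulating VPLEX volume path at the same time:

```python
# Illustrative cutover-order check for encapsulation (not VPLEX code).

STEPS = [
    "present_lun_to_vplex_backend",
    "claim_storage_volume",
    "create_whole_volume_extent",
    "create_raid0_device",
    "create_virtual_volume",
    "unpresent_native_lun_from_host",
    "present_virtual_volume_to_host",
]

def host_paths(completed):
    """Which kinds of paths the host holds after the completed steps."""
    native = "unpresent_native_lun_from_host" not in completed
    vplex = "present_virtual_volume_to_host" in completed
    return native, vplex

def safe_order(steps):
    completed = []
    for step in steps:
        completed.append(step)
        native, vplex = host_paths(completed)
        if native and vplex:
            return False   # simultaneous access risks corruption
    return True
```

Running the listed sequence passes the check; swapping the last two steps (presenting the VPLEX volume before removing native access) fails it, which is exactly why the narration recommends removing host access to the original SAN volumes first.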
[Diagram: host and VPLEX connected across Fabric A and Fabric B]

Array Storage Volumes found:
  VPD83T3:600601606bb02500aab2affa35b5de11
  VPD83T3:600601606bb025006a17a18d5bfade11
  VPD83T3:600601606bb02500ba7b6b1c49fade11
Host Initiator Ports detected:
  UNREGISTERED‐0x10000000c987422a
  UNREGISTERED‐0x10000000c987422b
Virtual Volumes detected
Encapsulation: Migrating a Host to VPLEX
This example illustrates the process of cutting over from native SAN volumes to VPLEX volumes via encapsulation. Observe the system state transitions as you step through this task sequence.
The basic idea is to logically integrate VPLEX into your production fabrics between your hosts and storage arrays.
To do this, the back‐end ports of VPLEX are first connected to the production fabrics.
Via suitable zoning and LUN masking, VPLEX back‐end ports, which are technically initiators, detect the back‐end storage arrays and volumes. Native array volumes or LUNs are then claimed by VPLEX, allowing your storage administrator to layer VPLEX virtual volumes on them for presentation to hosts.
Front‐end configuration is the next logical step. VPLEX front‐end ports are connected to the fabrics, and the zoning configuration modified to allow hosts to detect these ports as targets.
Once this is done, VPLEX can detect the host initiators (HBAs) which should then be registered with the appropriate host personality.
At this point, by creating a suitable storage view within VPLEX, it becomes possible to present VPLEX volumes to the host initiators. Note that in this process, the original SAN volumes from the array are now repackaged as VPLEX volumes and presented via new FC targets, (i.e. the VPLEX FE ports). The recommendation is to remove host access to the original SAN volumes, before presenting the encapsulating VPLEX volumes.
Storage Provisioning: Devices
• RAID‐1 – Mirrored VPLEX Device
  - Use arrays from the same tier
  - Ideal for nesting other devices
• RAID‐0 – Striped VPLEX Device
  - Ideal for encapsulated devices
  - Consider stripe depth
  - Avoid striping striped storage volumes
• RAID‐C – Concatenated VPLEX Device
  - Most flexible to grow

[Diagram: devices built from extents, and devices nested on other devices]
The VPLEX “device” construct forms the basis of core RAID capabilities supplied by VPLEX. The key value‐add is that VPLEX can enable RAID functionality across storage arrays.
A RAID‐1 VPLEX Device mirrors data to two extents or devices.
A RAID‐0 VPLEX Device stripes data across multiple extents or devices. The simplest possible device is a RAID‐0 device that uses one extent; this is typically what you would configure during encapsulation.
A RAID‐C VPLEX Device concatenates multiple extents or devices.
Viewing these as building blocks allows you to consider an organized system of device “nesting” to meet your customer’s specific needs.
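The "building blocks" view of nesting lends itself to a small capacity calculation. This is an illustrative model, not VPLEX code, and it simplifies real constraints (for example, RAID‐0 striping typically wants equal‐size members): RAID‐0 and RAID‐C aggregate their children's capacity, while RAID‐1 keeps only one usable copy, bounded by its smaller leg:

```python
# Sketch of nested VPLEX device capacity math (illustration only).

def capacity(device):
    """device is either an int (extent size in GB) or (raid_type, [children])."""
    if isinstance(device, int):
        return device
    raid, children = device
    sizes = [capacity(c) for c in children]
    if raid in ("raid-0", "raid-c"):
        return sum(sizes)          # stripe/concat: children's capacities add up
    if raid == "raid-1":
        return min(sizes)          # mirror: usable space bounded by smaller leg
    raise ValueError("unknown raid type: " + raid)

# A RAID-1 mirror whose legs are a striped pair and a concatenated pair:
nested = ("raid-1", [
    ("raid-0", [100, 100]),        # 200 GB striped leg
    ("raid-c", [150, 50]),         # 200 GB concatenated leg
])
```

Here `capacity(nested)` evaluates recursively, which mirrors how VPLEX lets devices be layered on devices: each level is just another building block.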
Provisioning: Multi‐pathing with EMC PowerPath
By default, EMC VPLEX volumes appear with vendor ID "EMC" and product ID "Invista". Thus, any version of PowerPath that can manage Invista volumes can also recognize and manage EMC VPLEX volumes. This example shows a Virtual Volume on a front‐end Linux host, as reported by PowerPath. Note that the default load‐balancing policy with PowerPath for a VPLEX volume is ADaptive. Other multi‐pathing options, including native OS multi‐pathing, are discussed later, in the Planning and Design module.
Extent Mobility
• Mobility of block data across extents, non‐disruptive to the host
• Extent mobility can only be performed within a cluster
• Original extent is freed up for reuse
• Fundamental use: non‐disruptive data mobility across heterogeneous storage arrays
[Diagram: host I/O to a Virtual Volume continues while its device's extent mapping moves from one storage volume to another]
VPLEX Local supports mobility of Extents – potentially across storage array frames – that is completely transparent to any layered virtual volume that is actively servicing I/O requests from a host.
As this example shows, the device‐to‐extent mapping changes at the end of a committed Mobility operation. However, the host to which the volume is provisioned is not even aware of this change.
Note that extent mobility requires that both the source extent and the target extent belong to the same VPLEX cluster.
Device Mobility

[Diagram: a Virtual Volume's underlying device is replaced by a new device on different extents and storage volumes, transparent to the host]
Another Mobility option with VPLEX Local is mobility at the device level. This could be used for example to move data across disparate storage arrays, or even to change the RAID level of a device without disruption.
Device mobility is supported across clusters as well, in a Metro‐Plex environment.
Mobility: Typical Task Sequence
1. dm migration start -n <name> -f <source extent/device> -t <target extent/device>
2. dm migration commit -m <name> --force
3. dm migration clean -m <name> --force
4. dm migration remove -m <name> --force

[Diagram: a temporary RAID‐1 copies the data blocks from the source extent/device to the target extent/device]
There are four basic operations involved in moving extents or devices. They are: start, commit, clean, and remove. Data mobility is accomplished by using RAID‐1 operations.
The start operation first creates a RAID‐1 device on top of the source device, specifying the source as one leg and the target device or extent as the other. It then copies the source's data to the target. This operation can be canceled as long as it has not been committed.
The commit operation removes the pointer to the source leg. Committing the operation immediately is not best practice.
At this point the target device is the only device accessible through the Virtual Volume.
The clean operation breaks the source device down all the way to the storage‐volume level. This operation is optional; note that the data on the source device is not deleted.
The remove operation removes the record from the mobility operation list. Data mobility operations can also be paused and resumed. These commands may be used in conjunction with the VPLEX scheduler to mitigate or eliminate disruption to production I/O.
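The start/commit/clean/remove lifecycle above can be sketched as a toy state machine. This is an illustration only, not VPlexcli behavior; the class and state names are hypothetical, but the one rule the narration states (no cancel after commit) is enforced:

```python
# Toy state machine for the dm migration lifecycle (illustration only).

class Migration:
    def __init__(self, name, source, target):
        self.name, self.source, self.target = name, source, target
        self.state = "created"

    def start(self):
        # Builds a temporary RAID-1 over source and target and begins the copy.
        self.state = "started"

    def cancel(self):
        # A migration can be canceled only before it is committed.
        if self.state == "committed":
            raise RuntimeError("cannot cancel after commit")
        self.state = "cancelled"

    def commit(self):
        # Drops the source leg; the target now backs the virtual volume.
        self.state = "committed"

    def clean(self):
        # Optional: tears the source back down to the storage-volume level.
        self.state = "cleaned"

    def remove(self):
        # Removes the record from the mobility operation list.
        self.state = "removed"

m = Migration("mig1", "extent_src", "extent_tgt")
m.start(); m.commit(); m.clean(); m.remove()
```

Running the four operations in order ends with the record removed; trying to cancel a committed migration raises an error, matching the "can be canceled as long as it is not committed" rule.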
Batched Mobility
• Enables scripting of extent and device mobility
• A batch can process either extents or devices, but not a mix of both
Task sequence for batched mobility:
1. Create migration plan: batch-migrate create-plan plan.txt -f <source> -t <destination>
2. Check plan for errors: batch-migrate check-plan plan.txt
3. Start migration, copy data to targets: batch-migrate start plan.txt
4. Commit migration: batch-migrate commit plan.txt
5. Clean up migration: batch-migrate clean --file plan.txt
6. Remove migration record: batch-migrate remove
Batched mobility provides the ability to script large‐scale migrations without having to specify individual extent‐by‐extent or device‐by‐device migration jobs.
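A plan check like the `batch-migrate check-plan` step can be sketched as follows. The real validation logic is not public, so this is a hypothetical illustration that only enforces the one constraint the slide states (a plan may hold extents or devices, but not a mix) plus an obvious sanity rule on targets:

```python
# Hypothetical sketch of a batch-migration plan check (not VPlexcli code).

def check_plan(jobs):
    """jobs: list of (kind, source, target); kind is 'extent' or 'device'.
    Returns a list of error strings; an empty list means the plan is valid."""
    kinds = {kind for kind, _, _ in jobs}
    if len(kinds) > 1:
        return ["plan mixes extents and devices"]
    errors = []
    targets = set()
    for _kind, _src, tgt in jobs:
        # Each target should receive data from exactly one source.
        if tgt in targets:
            errors.append("target " + tgt + " used twice")
        targets.add(tgt)
    return errors

plan = [("extent", "e1", "e10"), ("extent", "e2", "e11")]
```

A clean plan returns no errors; mixing an extent job and a device job in one plan, or reusing a target, is flagged before any data moves, which is the point of checking the plan before `start`.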
AccessAnywhere with VPLEX Metro

[Diagram: left, a Distributed Device presents one Virtual Volume with mirror legs on storage arrays at Cluster‐1/Site A and Cluster‐2/Site B, separated by synchronous distance; right, a Remote Device at one cluster is presented to hosts at the other cluster]
AccessAnywhere provides a logical device with full read/write access to multiple hosts at multiple locations – with the current release, separated by synchronous distance up to 100 km.
A key enabling VPLEX Metro technology for AccessAnywhere is distributed mirroring. It enables you to configure a RAID‐1 mirrored device with two legs, one on each cluster. Hosts at either site may issue I/O to this shared volume concurrently. Distributed coherent shared cache preserves data integrity of this volume.
This mirrored device has the same volume identity at both clusters, while being presented via distinct FC targets (i.e., VPLEX FE ports at each cluster).
Another enabling VPLEX Metro technology for AccessAnywhere is remote access.
This allows a device configured at one site to be presented to initiators at the other site for full read/write access. For remote exports, VPLEX's use of sequential read‐detection logic within the caching layer can significantly improve performance. Feasible configurations therefore include hosts with no SAN storage within their local site.
Distributed Device: I/O Operation

[Diagram: distributed-device write flow across the FC MAN between VPLEX Cluster‐1/Site A and VPLEX Cluster‐2/Site B]
1. Host in Cluster‐2/Site B writes data to the shared volume.
2. Data is written through cache to back‐end storage at both sites.
3. Data is acknowledged by the back‐end arrays.
4. The write is acknowledged to the host once written to disk.
Let’s examine the mechanics of I/O access of each of these enabling technologies in greater detail.
With a distributed device, when a host issues a write to the device, the data is placed in the cache of the ingress Director and then written through to the storage arrays at both sites. Only after both storage arrays have acknowledged write completion does the host receive the "write‐complete" acknowledgement from VPLEX.
This design completely eliminates the risk of losing host data in the event of VPLEX component failures.
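The write-through rule can be reduced to one line of logic. This is a minimal sketch of the acknowledgement semantics described above, not VPLEX internals; real failure handling (detach rules, mirror rebuilds) is deliberately omitted:

```python
# Minimal sketch of write-through acknowledgement for a distributed device:
# the host 'write-complete' is returned only after BOTH back-end arrays
# have acknowledged the write. (Illustration only, not VPLEX code.)

def distributed_write(data, site_a_array, site_b_array):
    """Each array is modeled as a callable returning True once data is on disk."""
    ack_a = site_a_array(data)      # write-through to Site A storage
    ack_b = site_b_array(data)      # write-through to Site B storage
    return ack_a and ack_b          # host ack requires both array acks

ok = distributed_write(b"10110", lambda d: True, lambda d: True)
not_acked = distributed_write(b"10110", lambda d: True, lambda d: False)
```

Because the host ack depends on both legs, there is no window in which a host believes data is durable while only one site holds it, which is the property the narration highlights.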
Remote Device: I/O Operation

[Diagram: remote-device I/O across the FC MAN between VPLEX Cluster‐1/Site A and VPLEX Cluster‐2/Site B]
• Host in Cluster‐2/Site B writes data to the volume; the write is acknowledged to the host once written to disk.
• Host in Cluster‐1/Site A reads data from the volume.
• Host in Cluster‐1/Site A writes data to the volume; the write is acknowledged to the host once written to disk.
With remote access:
Writes from hosts on the same cluster as the exported device work the same as writes to any local device – then written to the back‐end array, before the acknowledgement is sent to the host.
Reads from remote hosts can effectively exploit local cache, remote cache and sequential read‐ahead for near‐local performance.
For a write from a remote host, the new data is cached at the remote site. Existing data in the local cache is invalidated with an RPC message; the new data is then sent to the local site and written to the back‐end storage.
Distributed Device: Handling “Split‐brain”
Consider a distributed system with two sites:
From Site A’s perspective the following two conditions are indistinguishable:
Addressing this is fundamental to the design of distributed applications. With a Metro‐Plex distributed device, it is handled with a configurable "detach rule".

[Diagram: Partition Failure (Site A and Site B both up, FC‐MAN link down) versus Site Failure (Site B down entirely)]
Let’s examine the logistics of failure handling in a Metro‐Plex environment.
There are two types of failures in a Metro‐Plex, partition failures and site failures. Partition failures typically occur more often than site failures. However, from one site’s point of view, both partition failures and site failures are handled the same way. Metro‐Plex handles both types of failures using detach rules, as we’ll see next.
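The detach-rule behavior described above can be sketched in a few lines. This is an assumed model built from the narration, not VPLEX code: the cluster named by the rule keeps read/write access if it survives, the other suspends I/O, and from the survivor's perspective a partition and a remote site failure look the same:

```python
# Sketch of detach-rule semantics (assumed behavior per the narration).

def io_state(cluster, detach_rule, surviving_clusters):
    """Return this cluster's I/O state for a distributed device after a failure.
    detach_rule names the winning cluster, e.g. "cluster-1" for the
    "cluster-1 detaches" rule set."""
    if cluster not in surviving_clusters:
        return "down"
    return "read-write" if cluster == detach_rule else "suspended"

# Partition: both clusters survive but cannot see each other.
partition = [io_state(c, "cluster-1", {"cluster-1", "cluster-2"})
             for c in ("cluster-1", "cluster-2")]

# Site failure: cluster-2 is gone entirely.
site_fail = [io_state(c, "cluster-1", {"cluster-1"})
             for c in ("cluster-1", "cluster-2")]
```

In both scenarios cluster-1 (the biased site under this rule) continues serving I/O, which is why a single detach rule covers partition and site failures alike.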
Distributed Device: Configuring Detach Rule
Can specify a pre‐defined rule set or customized rule set
Failure handling behavior is configured by tying a specific “detach rule” to each distributed device. In the example shown, the ruleset “cluster‐1 detaches” implies that upon failure, if cluster‐1 survives then it will continue to provide read/write access to the volume, while cluster‐2 will suspend I/O activity to this device at the other site. The detach rule can be changed by selecting the distributed device’s supporting device and then selecting a different cluster to detach from. Detach rules may be customized to meet specific needs.
Distributed Devices: Supported Detach Options
Detach options currently supported with VPLEX distributed devices in a Metro‐Plex:
• Biased‐site detach
• Non‐biased‐site detach
• Manual detach
  - Use with an automated script on production host(s) to activate read/write access from either site after a failure event
There are three major categories or approaches to detach rules.
Biased‐site detach and non‐biased‐site detach are both simple to implement, with pre‐defined rule sets in place. Either may adequately address the customer's needs, for example when one site can clearly be viewed as the production site and the other as secondary, within the context of a given DR1.
To enable complete control of the VPLEX DR1 environment from a stretched host cluster, the use of “manual detach” with scripting is recommended.
Monitoring: VPLEX Performance

• Creating monitors: monitor create --name <name> --period <time> --director <Director_Name> --stats <stat>
• Listing monitors
• Destroying monitors: monitor destroy <monitor>
Performance data can be collected on the VPLEX system by creating monitors and sinks. Monitors collect performance statistics on various VPLEX components. These monitors are created within the VPlexcli using the monitor command. By default, monitors collect statistics every 30 seconds. This collection time can be modified if desired.
Once a monitor is created, it can be found in the /monitoring directory. Monitors only start collecting data when they have at least one associated “sink”, as we’ll see next. Monitors can be destroyed using the monitor destroy command.
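The monitor/sink relationship can be modeled in a short sketch. This is a toy illustration, not VPlexcli code; it captures the two behaviors the narration describes: the default 30‑second period, and the rule that a monitor records nothing until at least one sink is attached:

```python
# Toy model of monitors and sinks (illustration only, not VPlexcli).

class Monitor:
    def __init__(self, name, period=30):
        self.name = name
        self.period = period       # seconds between collections (default 30)
        self.sinks = []            # console, file, or SNMP sinks

    def add_file_sink(self, path):
        self.sinks.append(("file", path))

    def collect(self, stats):
        """Deliver one sample to each sink; returns number of sinks written."""
        if not self.sinks:
            return 0               # no sink attached: nothing is recorded
        for _kind, _dest in self.sinks:
            pass                   # a real file sink would append a CSV row here
        return len(self.sinks)

mon = Monitor("director-1-fe-stats")
before = mon.collect({"fe-ops": 120})                 # no sink yet: 0 writes
mon.add_file_sink("/var/log/VPlex/cli/fe-stats.csv")  # hypothetical path
after = mon.collect({"fe-ops": 120})                  # one sink: 1 write
```

The sink path above is illustrative only; the point is simply that `collect` is a no-op until a sink exists, mirroring the statement that monitors start collecting data only once they have at least one associated sink.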
Monitoring: VPLEX Performance (Cont’d)Monitoring: VPLEX Performance (Cont’d)
• Listing statistics available for monitoringmonitor stat-list
• Monitor collect Updates a performance monitor immediately
Ad‐hoc manual collect of data
• Supported monitor “sink” types: console, file, SNMP
• Adding sinks for monitors
monitor add-file-sink -n <name> -f <file_location> -m <monitor_to_add>
• Removing a sink
monitor remove-sink <sink>
To activate and view the statistics collected by a monitor, at least one sink must be created. Sinks receive the output from monitors. Three types of sinks can be defined: console, file, and SNMP; however, SNMP sinks are not supported at this time. File sink output is written as comma‐separated values, so a .csv file name extension is a convenient choice, and the file can then be loaded into a program such as MS Excel for easier viewing. Console sinks have limited use because their output interferes with console typing.
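The monitor commands above can be combined into a short VPlexcli session. The monitor name, sink name, and output path below are illustrative placeholders of my own; the director name and statistic keep the source's placeholder syntax, and the exact argument forms should be verified against the VPlexcli reference.

```
# Create a monitor (statistics are sampled every 30 seconds by default)
monitor create --name dirA-mon --period <time> --director <Director_Name> --stats <stat>

# Attach a file sink; the monitor only starts collecting once it has a sink.
# Comma-separated output makes a .csv extension a convenient choice.
monitor add-file-sink -n dirA-sink -f /var/log/dirA-mon.csv -m dirA-mon

# Ad-hoc, immediate collection
monitor collect

# Tear down: remove the sink, then destroy the monitor
monitor remove-sink dirA-sink
monitor destroy dirA-mon
```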
Monitoring: Event Handling and Report Generation
ESRS Gateway
Management Server
ConnectEMC
Call Home Listener
SYR
EMA_Adaptor
VPlexcli
Engine
TCP ports 22, 9010, 443 and 5901
Shown is the high level architecture of event handling and messaging flow from the Engine to the Management server, to a properly configured ESRS gateway.
VPlexcli, which runs on the Management Server, pulls events every second from a process on a Director. The Call Home Listener on the Management Server examines the events and determines which events should initiate a call home. It then places those events into the /opt/emc/VPlex/Event_Msg_Folder directory as .txt files.
The EMA_adaptor’s job is to take the text files from the Event_Msg_Folder directory and create the required XML files using the EMA API. The EMA_adaptor then places those files into the /opt/emc/connectemc/poll directory. The ConnectEMC process picks up the XML event files and sends them to the ESRS Gateway. If the events are successfully sent to the gateway, they are also copied into the /opt/emc/connectemc/archive directory. If transmission fails for some reason, the corresponding events are placed into the /opt/emc/connectemc/failed directory.
TCP ports 22, 9010, 443, and 5901 must be open between the Management Server and the ESRS Gateway. The ESRS Gateway classifies incoming events as belonging to this VPLEX instance via the “Top‐Level Assembly” field within each event. The “Top‐Level‐Assembly” is a cluster‐unique identifier that is preset at the factory on all engines of a VPLEX cluster.
Generating System Reports: SYR
SYR generates a complete report of the VPLEX System
Task                                                       Command
Manually run SYR                                           syrcollect
List SYR                                                   scheduleSYR list
Configure SYR (sends a weekly report to the ESRS Gateway)  scheduleSYR add -d <day> -t <hour> -m <minute>
SYR is a process that collects VPLEX system reports to send to the ESRS gateway. SYR reports use the same directories as ESRS events. SYR can be run manually using the syrcollect command; or it can be run at a scheduled time using the scheduleSYR command. SYR reports are sent to the ESRS Gateway by the ConnectEMC process. Once SYR has been scheduled, it will run weekly at the scheduled time.
Collecting VPLEX Log Files
collect-diagnostics
• Collects logs, cores, and configuration information from the Management Server and the directors
• Places a tar.gz file in /diag/collect-diagnostics-out
The collect-diagnostics command can be used when troubleshooting VPLEX issues. This command produces a tar.gz file containing logs, cores, and configuration information about the Management Server and directors within a VPLEX system. This file is very large and should be moved off the system once it has been generated.
Scheduling: “cron”‐style
schedule
Manage and control the timing of specific tasks
The VPlexcli schedule command may be used to run commands in batch mode at an arbitrary time, or periodically on a schedule. This can be particularly useful for offloading certain types of activity, for example mobility, to off‐production hours.
Maintenance: Non‐disruptive Code Upgrade (NDU)
• NDU process for VPLEX: code upgrades with no disruption to production hosts performing I/O to VPLEX virtual volumes
• Requires best practices to be followed for host connectivity, and supported multi‐pathing software
• Uses a notion of “first upgraders” and “second upgraders”
First: Director A of every engine is upgraded, then rebooted
Second: Director B of every engine is upgraded, then rebooted
VPLEX Metro upgrade: both clusters are upgraded with a single “ndu” operation issued on one Management Server
These are the steps to perform an NDU. I/O continues while one side of an engine is being upgraded. The time to complete an NDU should be roughly the same regardless of the number of engines in the system, because an NDU upgrades all A directors at once, and then all B directors at once.
First upgraders: every engine’s A directors are upgraded
A directors’ firmware is shut down during the upgrade
I/O is automatically redirected to B directors
Once upgraded, A directors reboot
A directors begin serving I/O again
Second upgraders: every engine’s B directors are upgraded
B directors’ firmware is shut down during the upgrade
I/O is automatically redirected to A directors
Once upgraded, B directors reboot
B directors begin serving I/O again
Module 4: Planning and Design Considerations
Upon successful completion of this module, you should be able to:
• Perform planning and design for VPLEX deployment
• State and explain the rationale for recommended best practices with VPLEX implementations
This module covers key planning and design considerations relevant to VPLEX solutions.
FE
FE
BE
BE
VPLEX Physical Connectivity: SAN Best Practices
Volume 2
Volume 1
Hosts
Arrays
• Deploy mirrored fabrics
• Connect every host and every storage array to both fabrics
• For each VPLEX Director, distribute front‐end ports over both fabrics
• For each VPLEX Director, distribute back‐end ports over both fabrics
• For each FE module and BE module, distribute ports over both fabrics
Fabric B
Fabric A
When deploying the VPLEX cluster, the general rule is to use a configuration that provides the best combination of simplicity and redundancy. In many instances connectivity can be configured to varying degrees of redundancy. However, there are some minimal requirements that should be met.
Deploy mirrored fabrics: this is standard EMC practice. In addition, it is preferable to isolate the front‐end fabrics from the back‐end fabrics. This would ensure clean separation of hosts from storage arrays. This is appropriate in environments where all encapsulation of existing production data is complete, and any future provisioning to hosts will be exclusively from VPLEX.
Connect every host and every storage array to both fabrics.
Each Director should be assigned ports on both fabrics; otherwise, a fabric failure could reduce the paths and computing power of the VPLEX, doubling the workload for the surviving Directors. Distribute FE ports of each director over both fabrics.
Distribute BE ports of each director over both fabrics.
The above two rules ensure that a complete outage on one fabric does not render a Director completely non‐operational on either the front end or the back end.
Thus the processing power of the VPLEX system is not compromised by a fabric outage.
Distribute the four ports of each I/O module over both fabrics.
Again this minimizes loss of VPLEX efficiency and processing power in the event of complete failure on one fabric.
• Each director must be provided access to every BE volume in the cluster
• Active/Active array: For each director, provide at least one BE path to each volume via each fabric
• Active /Passive array: For each director, provide BE paths via both controllers to each LUN via each fabric
• VPLEX BE port “initiator personality”: open systems host; use failovermode=1 with CLARiiON arrays
VPLEX Logical Connectivity: Back‐end
VMAX
Volume
CX4‐960
LUN
A0
A1
B0
B1
Fabric B
Fabric A
It is a requirement that each Director have at least one viable, active path to every Storage Volume in a VPLEX cluster.
This means that, to be usable, a Storage Volume must be presented to every Director in the same cluster.
For active/passive storage arrays, make sure that a given Director has both active and passive paths to the storage volume.
VPLEX Logical Connectivity: Front‐end
Engine 2
Engine 1
Engine 2
Director B
Engine 1
Director A
Director A
Director B
• Single Engine configuration: For each host, configure FE paths to both Director A and Director B
• Dual Engine and Quad Engine configuration: For each host, configure FE paths to A and B of separate engines
Fabric B
Fabric A
Volume 2
Volume 1
Hosts
Arrays
Front‐end hosts should be configured with paths to VPLEX front‐end ports, which serve as virtualization targets, via separate fabrics. In a single engine system, configure at least one front‐end path to each director. This enables the host to maintain I/O access to VPLEX volumes during an NDU code upgrade.
With dual engine or quad engine systems, additional resiliency can be obtained by using “A” and “B” directors on distinct engines. This ensures that the host does not lose I/O access to volumes even during a planned or unplanned shutdown of one engine.
SAN Volume Requirements: VPLEX Meta Volume
• One active VPLEX meta volume per cluster
• Used internally for storing meta data
• Failure impact: does not affect production I/O to existing VPLEX volumes
Meta Volume Best Practices:
• Required capacity: 78 GB or larger
• Recommended: run VPLEX meta volume backup periodically
• General requirements for SAN volumes to be used for metas: Highest possible availability
Not demanding of performance:
Low write I/O ‐ only during configuration changes
High read I/O – only during Director boot and NDU
Listed are the requirements and best practices for VPLEX Meta Volumes. I/O throughput capability is not a serious consideration for a meta volume, since it is updated only during configuration changes. Availability is the overriding concern here. It is critical to mirror the Meta Volume onto two different arrays. An additional recommendation is to create meta volumes on two arrays with different refresh timelines, thus avoiding having to migrate the data off both arrays at once. It is important to make periodic backups of the Meta Volume, especially after VPLEX configuration changes or upgrades; this prevents the system from ever losing access to newly created VPLEX objects.
SAN Volume Requirements: VPLEX Logging Volume
• Required only in Metro‐Plex: at least one logging volume per cluster
• Used internally to track changes between legs of distributed RAID‐1 devices during loss of connectivity between clusters
• Required capacity: 1 bit for every 4‐Kbyte page of distributed device
One 10 GB logging volume can support 320 TB of distributed devices
• General requirements for SAN volumes to be used for logging:
Very high performance requirement
No I/O activity on logging volumes under normal conditions
High random, small‐block write I/O rate during loss of connectivity
High small‐block read I/O rate during incremental re‐synchronization
Highest possible availability
Use striped and mirrored volumes to meet these requirements
Listed are the requirements and best practices for VPLEX logging volumes.
A pre‐requisite for creating a distributed device, or a remote device, is that you must have a logging volume at each cluster. Single‐cluster systems and systems that do not have distributed devices do not require logging volumes. Logging volumes keep track of changed blocks during an inter‐cluster link failure. After a link is restored, the system uses the information in logging volumes to synchronize the distributed devices by sending only changed block regions across the link.
The logging volume must be large enough to contain one bit for every page of distributed storage space. So for example, you only need about 10 GB of logging volume space for 320 TB of distributed devices in a Metro‐Plex. The logging volume receives a large amount of I/O during and after link outages. So it must be able to handle I/O quickly and efficiently.
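The sizing rule above (one bit of logging space per 4-Kbyte page of distributed capacity) can be checked with a small Python sketch; the helper name is mine, not from the course.

```python
# Logging-volume sizing: 1 bit per 4 KB page of distributed device capacity.
def logging_volume_bytes(distributed_bytes, page_size=4096):
    """Minimum logging capacity: one bit per page, eight bits per byte."""
    pages = distributed_bytes // page_size
    return pages // 8

TB = 2 ** 40
GB = 2 ** 30

# 320 TB of distributed devices requires a 10 GB logging volume,
# matching the figure quoted on this slide.
print(logging_volume_bytes(320 * TB) // GB)  # → 10
```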
Storage Views: Best Practices
• Each storage view should have:
At least two registered initiators (HBA ports) from each host
Recommended: HBAs distributed over redundant fabrics
At least two VPLEX FE ports: one from an A director, one from a B director
Recommended: ports from different engines when possible, and distributed over redundant fabrics
• Create one storage view for all the hosts that need access to the same storage
Storage View
V Vol
Host Initiator
Host Initiator
FE Port
FE Port
When creating storage views, follow these best practices: Create one storage view for all hosts that need access to the same storage, and then add all required volumes to the view.
Redundancy requirements are based on standard EMC guidelines for SAN configuration. Each host should have at least two registered initiators in the view. Access to the volumes should be enabled via at least two VPLEX front‐end ports in the view. When selecting the front‐end ports for a storage view, make sure to follow the previously‐discussed best practices – use ports from at least one A director and one B director, and whenever possible, from directors in separate engines.
Partition Alignment
• VPLEX page size = 4K
• VMAX track size = 32K
• Minimum recommended alignment = 64K
• Can’t go wrong with 1M
When creating VPLEX virtual volumes, pay attention to partition alignment in order to avoid host‐to‐storage performance problems in production.
Follow these best practices for partition alignment:
Best practices that apply to directly‐accessed storage volumes also apply to virtual volumes
I/O operations to a storage device that cross page, track, or cylinder boundaries must be minimized; these lead to multiple read or write operations to satisfy a single I/O request
Misaligned partitions can consume additional resources in VPLEX and the underlying storage array(s), leading to less than optimal performance
Align partitions for any x86‐based OS platform
Align partitions on 32 KB boundaries
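A quick way to reason about the boundaries above is a modulo check; the helper and the legacy sector-63 example below are illustrative, using the figures from this slide (4K page, 32K track, 64K minimum recommended alignment).

```python
# Partition-alignment check against the boundaries quoted on this slide.
KB = 1024

def is_aligned(offset_bytes, boundary_bytes=64 * KB):
    """True if a partition's starting offset falls on the given boundary."""
    return offset_bytes % boundary_bytes == 0

# Legacy x86 partitions often started at sector 63 (63 * 512 = 32256 bytes),
# which misses the 4K page, 32K track, and 64K boundaries alike.
print(is_aligned(63 * 512))   # → False

# A 1 MB offset is a multiple of 4K, 32K, and 64K,
# hence the "can't go wrong with 1M" advice.
print(is_aligned(1024 * KB))  # → True
```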
VPLEX Encapsulation: Best Practices
• “Data‐in‐place” migration: minimizes downtime
• Best Practices: Claim storage volumes using the application consistent flag
Prevents reconfigurations other than “one‐for‐one” (single extent spanning entire SAN volume)
Ensures that production data does not become unavailable or corrupted
Migrate into VPLEX in phases
Divide migrations by hosts or initiator groups
• Limitation: Capacity of encapsulation target must be an integral multiple of 4 Kbytes
Avoid concurrent I/O activity from host to the native array volume, and to the VPLEX encapsulated volume
Here are some of the best practices and requirements for encapsulation.
A storage volume to be encapsulated must have a capacity that is an integral multiple of 4 Kbytes. Otherwise, encapsulation will render it inaccessible to the host.
During encapsulation hosts should be allowed to perform I/O to virtual volumes or storage volumes, but not both at the same time – that can cause data corruption.
Migrations should be performed on an initiator group basis. This way any necessary driver updates can be conveniently handled on a host by host basis.
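The 4-Kbyte capacity limitation above lends itself to a trivial pre-check; the helper name is mine, shown only to make the rule concrete.

```python
# Encapsulation pre-check: the target's capacity must be an integral multiple
# of 4 Kbytes, or the encapsulated volume will be inaccessible to the host.
def can_encapsulate(capacity_bytes):
    return capacity_bytes % 4096 == 0

GB = 2 ** 30
print(can_encapsulate(10 * GB))        # → True: 10 GB is 4K-aligned
print(can_encapsulate(10 * GB + 512))  # → False: trailing 512-byte remainder
```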
VPN and MAN‐COM: Best Practices
• Metro‐Plex requirement: distance <= 100 km; FC‐MAN round trip latency < 5 milliseconds
• Supported distance extension technologies: FC over dark fibre; DWDM
• Best Practice:
Two physical MAN links with similar characteristics, such as latency
Configure long‐distance links between VPLEX clusters using ISLs
Redundant MAN fabrics; one connection to each MAN fabric from every VPLEX Director
VPLEX Management Server
Cluster‐1/Site A
Director B
Director A
Director B
Director A
Switch
Switch
VPLEX Management Server
Cluster‐2/Site B
Director B
Director A
Director B
Director A
Switch
Switch
WAN
Engine 2
Engine 1
ISL 1
ISL 2
Engine 2
Engine 1
IPsec Tunnel
The diagram illustrates the requirements for IP and FC connectivity between the two clusters in a Metro‐Plex.
A fundamental requirement – without which the Metro‐Plex cannot be installed – is IP connectivity between the VPLEX Management Servers. As part of initial Metro‐Plex install, a VPN tunnel is established for secure connection and interchange of configuration data between these servers.
Additionally, the VPLEX Directors of each cluster need visibility to the Directors of the other cluster via their MAN‐COM ports. Currently, distances of up to 100 km between clusters are supported. Round‐trip latency on this link must be less than 5 milliseconds. Bandwidth requirements will depend on the specific customer application; in general, a minimum of 45 Mbps is the guideline.
The FC‐MAN links can use either dark fibre or DWDM.
When configuring a Metro‐Plex it is best to make use of two fabrics for the FC‐MAN connection, allowing a Director to communicate with all the other Directors on either of the two fabrics. This provides the best possible performance and fault tolerance.
If MAN traffic must share the same physical link as customer production traffic, then logical isolation must be implemented using VSANs or LSANs.
Note that there are specific zoning practices to be followed when exposing Director FC‐MAN ports to each other. Refer to the product installation guide for details.
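To see why the 100 km and 5 ms figures are compatible, a rough propagation estimate helps. The ~5 microseconds/km figure for light in optical fibre is a common rule of thumb, not a number from this course, so treat the result as an approximation.

```python
# Rough fibre propagation-latency estimate for the FC-MAN link.
US_PER_KM = 5.0  # assumed rule of thumb: ~5 microseconds per km in optical fibre

def round_trip_ms(distance_km):
    return 2 * distance_km * US_PER_KM / 1000.0

# At the 100 km maximum, raw propagation is about 1 ms round trip, leaving
# headroom within the 5 ms Metro-Plex budget for switch and equipment delays.
print(round_trip_ms(100))  # → 1.0
```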
Mobility Recommendations
• Device Mobility
Mobility between dissimilar arrays
Relocate hot devices from one array type to another
Relocate devices across clusters in a Metro‐Plex
• Batch Mobility
For non‐disruptive tech refreshes and lease rollovers
For non‐disruptive cross‐Plex device mobility
Only 25 devices or extents can be in transit at one time
Additional mobility will be queued if greater than 25
• Extent Mobility
Load balance across storage volumes
Listed are some typical applications for each supported type of Mobility.
Extent mobility can be used for load balancing across the storage volumes. This can also be used for array mobility where source and target arrays have a similar configuration, i.e. same number of storage volumes, identical capacities, etc.
Device mobility can be used for data mobility between dissimilar arrays, relocating a “hot” device from one type of storage to another.
Distributed Devices: Host Connect Topologies
• Local Access
Each host accesses the volume via FE ports on one cluster only
• Spanned Access (NOT Supported in V4.0)
Each host accesses the volume via FE ports on both clusters
There are two fundamental models for host access to DR1 volumes in a Metro‐Plex.
With “Local Access”, the fabrics at the two sites remain separate, with hosts at each site accessing DR1 volumes via local VPLEX FE ports only.
With “Spanned Access”, the hosts have access to fabrics at both sites and can therefore access DR1 volumes through FE ports at both sites. This provides for additional resiliency in a stretched host cluster – since with this access model, the host can tolerate loss of an entire VPLEX cluster at either site. Note that Spanned Access is not supported in v4.0.
Scalability and Limits
Parameter                                  Maximum
Active inter‐cluster rebuilds              25
Active intra‐cluster rebuilds              25
Total storage provisioned in a system      8 PB
Storage volume size                        Up to 32 TB
Virtual volume size                        Up to 32 TB
Meta volume size                           78 GB
RAID‐1 mirror legs                         2
Storage volumes                            8000 per cluster
Virtual volumes                            8000 per cluster
Extents                                    24000
Initiators (HBA ports)                     400
Shown are some key design limits; a complete table of all EMC VPLEX‐related design limits is published in the Release Notes. Always refer to the current version of the product Release Notes for these limits, which are subject to change until GA.
Volume Limits in a Metro‐Plex: Example
Cluster‐1/Site A
Hosts
6000 “local” volumes
6000 local‐devices
2000 distributed‐devices
2000 local‐devices 2000 local‐devices
2000 “stretched” volumes
6000 “local” volumes
6000 local‐devices
Cluster‐2/Site B
Hosts
Here is an example to illustrate how the maximum limit of 8000 volumes per cluster can be effectively exploited in a Metro‐Plex solution.
In this scenario, we have 2000 distributed devices with the corresponding 2000 “stretched” volumes that can be presented to hosts at both sites. These volumes can potentially be shared by hosts across sites, for example to accommodate distance VMotion or stretched host clustering applications. Note that our 2000 “top‐level” distributed devices (i.e. devices that are enabled for front‐end presentation) are layered upon 2000 local devices within each cluster.
In addition, you can configure up to 6000 more “top‐level” local devices at each site, that are presented to local hosts only. These would be suitable for data that doesn’t need to be shared across sites.
This example shows how to conform to the 8000 volumes per VPLEX cluster limit, while also maximizing the benefit to the customer.
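The arithmetic behind this example can be sketched in a few lines of Python; the variable names are mine, the figures are from the slide.

```python
# The 8000-volumes-per-cluster limit, split between shared and site-local
# volumes in the Metro-Plex example on this slide.
PER_CLUSTER_LIMIT = 8000

distributed_volumes = 2000  # "stretched" volumes presented to hosts at both sites
local_volumes = 6000        # "local" volumes presented to local hosts only

# Each distributed device is layered on one local device within each cluster,
# so both clusters carry the same 2000-volume cost for the stretched set.
for cluster in ("Cluster-1/Site A", "Cluster-2/Site B"):
    total = distributed_volumes + local_volumes
    print(cluster, total, total <= PER_CLUSTER_LIMIT)  # → 8000 True at each site
```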
EMC VPLEX: Solution Design Tools
• Simple Support Matrix (SSM)
• VPLEX Sizing Tool (VST)
Currently a calculator to determine cluster size
Plan is to integrate with BCSD in the future
• HEAT
Check for host compatibility with VPLEX
• VPLEX Deployment Tool (VDT)
Assists with VPLEX configurations, implementations, and modifications in VPLEX clusters
Executable that runs on Windows
• SVC Qualifier
These are the current VPLEX solution design tools in active development.
Network quality and latency assessment is recommended.
VPLEX Sizing Tool
The VPLEX Sizing Tool can be used to validate a proposed VPLEX solution – either single cluster or Metro‐Plex.
It requires basic information about the type of workload, volume count, host initiator count, etc.
Given this data, the tool checks whether the proposed design is capable of handling the workload from a performance standpoint, and whether it conforms to the complete list of configuration limits, as listed in the Release Notes.
Simple Support Matrix (SSM)
• Current VPLEX SSM is downloadable from:
https://elabnavigator.emc.com/emcpubs/elab/esm/pdf/EMC_VPLEX.pdf
The Simple Support Matrix provides a comprehensive view of current interoperability statements within a compact layout. It will be accessible through eLab Navigator.
Supported operating system base platforms, multi‐pathing options, volume management options, and host clustering options are presented here in an easy‐to‐read format, for quick reference.
Interoperability: Current Limitations
• Timefinder/Clone/Snap: NOT Supported at this time
• MirrorView/SRDF: can be used only when target or R2 site volumes are not virtualized with VPLEX
Only 1:1 mapping between a VPLEX virtual volume and an array physical volume is supported, because the remote site (target/R2) won't be virtualized
• Currently VPLEX supports only thick‐to‐thick data moves
Virtual provisioning and support for thick‐to‐thin non‐disruptive mobility in VPLEX are planned to be added over time
• RecoverPoint: not integrated and supported with VPLEX
Shown are some of the key interoperability limitations at launch time.
In v4.0, Timefinder/Clone/Snap is not supported.
MirrorView/SRDF can be used on the VPLEX back end as long as the target or R2 site volumes are not virtualized with VPLEX. This also means that only 1:1 mapping between a VPLEX virtual volume and an array physical volume is supported, because the remote site (target/R2) won't be virtualized.
In v4.0, VPLEX will support only thick‐to‐thick data moves. Virtual provisioning and support for thick‐to‐thin non‐disruptive mobility in VPLEX are planned to be added over time.
RecoverPoint is not integrated and supported with v4.0. This functionality will be added over time.
Course Summary
• EMC VPLEX represents innovative local and distributed federation technology. It is positioned to address non‐disruptive workload relocation, distributed data access, workload resiliency and simplified storage management.
• VPLEX Local supports local federation including consolidation, heterogeneous pooling and non‐disruptive mobility within a data center.
• VPLEX Metro supports the above, as well as distributed federation across sites or failure domains, within synchronous distances (up to 100 km, latency < 5 msec).
• VPLEX offers AccessAnywhere with the key enablers including: distributed virtual volumes over distance, remote access, and mobility within and across clusters.
This concludes the instructional portion of this training. These are the key points that have been covered in this course.
Please proceed to take the assessment.