
VBLOCK™ DATA PROTECTION: BEST PRACTICES FOR EMC VPLEX WITH VBLOCK SYSTEMS

July 2012

© 2012 VCE Company, LLC. All Rights Reserved.

www.vce.com


Contents

Introduction
  Business case
  Solution
  Features and benefits
  Scope
  Audience
  Feedback
  Terminology

Technology overview
  VCE Vblock™ Systems
  VMware vCenter Server
  VMware vCenter Server Heartbeat
  VMware vSphere HA
  VMware vMotion
  EMC VPLEX
    VPLEX Local
    VPLEX Metro
    VPLEX Witness

Use Cases
  Business continuity
  Workload and data mobility

Deployment guidelines and best practices
  Planning the deployment
  VPLEX to Vblock system mappings
  VPLEX port requirements per connected Vblock system
  Best practices for host front-end connectivity
  Deploying VPLEX Witness
  Monitoring VPLEX
    Power and environmental monitoring
    Event logging
    Daily overall health monitoring
    Performance monitoring
  Configuring metadata and logging volumes
    Best practices for metadata volumes
    Best practices for logging volumes
  Virtual networking high availability
  Using host affinity groups
  Compatibility guidelines for VPLEX with UIM/P
  Oracle Real Application Clusters

Recommended configurations
  Non Cross-Cluster Connect configurations
    Non Cross-Cluster Connect without VPLEX Witness
    Non Cross-Cluster Connect with VPLEX Witness
  Cross-Cluster Connect configuration
  Shared VPLEX configuration
  Data flow

Failure scenarios
  Application and management failover
  Application and management failback
  Failure scenarios
    Non Cross-Cluster Connect with Witness failure scenarios
    Cross-Cluster Connect with Witness failure scenarios

Conclusion
  Next steps

Additional references
  VCE
  VMware
  EMC


Introduction

Today, more and more enterprises are virtualizing their business-critical applications to deliver the most value back to their businesses. In a virtualized environment, where physical servers host many virtual servers, the volume of data and the speed of change require new techniques and methods for:

Protecting essential data to ensure business continuity

Moving and relocating applications and data to accommodate dynamic workloads

This paper contains best practices for deploying EMC VPLEX with Vblock™ Systems for a reliable and predictable business continuity and workload mobility solution. VPLEX is a critical element of Vblock Data Protection, a family of data protection solutions that leverages technology from EMC, Cisco, and VMware.

Business case

Business continuity and workload mobility are key IT and business objectives. Downtime of important applications is a costly proposition and extended downtime can be disastrous to any business. However, challenges such as high complexity, high costs, and unreliable solutions have limited the ability of organizations to implement effective business continuity plans.

True business continuity requires much more than just point-to-point replication. To be most cost-effective, applications running on virtual servers in one location must be able to migrate to and operate on virtual servers in remote locations without stopping and restarting. These applications must be able to continue to access data regardless of physical location.

Finding an agile, non-disruptive method to move applications and their data within and between data centers to balance workloads, maintain systems, and consolidate resources presents a significant challenge to IT organizations. Traditionally, organizations performed a series of manual tasks and activities to transfer applications and data to an alternate location. IT staff either made physical backups or used data replication services. Applications were stopped and could not be restarted until testing and verification were completed.

Solution

To meet the business challenges presented by today’s on-demand 24/7 world, virtual workloads must be highly available and mobile—in the right place, at the right time, and at the right cost to the enterprise.

EMC VPLEX is an innovative business continuity and workload mobility solution that can easily move business applications within or between Vblock systems located in the same data center or metropolitan area. VPLEX can also mirror business-critical data, deliver zero data loss, and ensure automatic near-zero application recovery time. Working in conjunction with VMware vMotion, VPLEX is a hardware and software solution for Vblock systems that provides enhanced availability for business continuity and dynamic workload mobility.


VPLEX solves many of the business continuity and workload mobility challenges facing enterprises today. With VPLEX, organizations can:

Eliminate planned downtime due to maintenance activities

Automatically handle any unplanned events

Load balance between sites to drive higher asset utilization

VMware vMotion leverages the virtualized converged infrastructure of the Vblock system to move an entire running virtual machine instantaneously from one server to another. VMware Distributed Resource Scheduler (DRS) uses vMotion to continuously monitor utilization across resource pools and intelligently align resources with business needs. VMware High Availability (HA) ensures application restart after server or complete site failure.

Features and benefits

Deploying VPLEX with the Vblock system provides many benefits including:

Distributed storage federation—Achieve transparent mobility and access within, between, and across data centers

EMC AccessAnywhere—Share, access, and relocate a single copy of data over distance

Scale-out cluster architecture—Start small and grow larger with predictable service levels

Advanced data caching—Improve I/O performance and reduce storage array contention

Distributed cache coherence—Automate sharing, balancing, and failover of I/O across clusters

Mobility—Migrate and relocate virtual machines, applications, and data

Resilience—Reduce unplanned application outages between sites

Scope

To help customers choose the VPLEX configuration most suitable for their specific business objectives and environment, this paper presents:

Descriptions of the key use cases for deploying VPLEX Local or Metro with the Vblock system

VPLEX deployment options for business continuity and data mobility

Guidelines and best practices for deploying VPLEX with the Vblock system

Detailed instructions for installing and configuring VPLEX with the Vblock system are not included. Refer to Additional References for a list of the appropriate installation and administration guides.

Audience

This paper will be of particular interest to system, application, database, and storage architects; VCE and EMC vArchitects; and anyone interested in deploying a VPLEX solution for Vblock systems.


Feedback

To suggest documentation changes and provide feedback on this paper, send email to [email protected]. Include the title of this paper and the name of the topic to which your feedback applies.

Terminology

Block storage: Data structured as blocks. A block is a sequence of bytes or bits having a nominal length (block size). The process of putting data into blocks is called blocking. Blocking is used to facilitate the handling of the data stream by the computer program receiving the data. Blocked data are normally read a whole block at a time. Virtual volumes in VPLEX are presented to users as a contiguous list of blocks.

Business continuity: The effective flow of essential business functions during and after a disaster. Business continuity planning develops processes and procedures to prevent interruption of mission-critical services, and to reestablish full functioning as swiftly and smoothly as possible.

Distributed virtual volume: A VPLEX virtual volume with complete, synchronized copies of data (mirrors), exposed through two geographically separated VPLEX clusters. Servers at distant data centers can access distributed virtual volumes simultaneously, thus allowing vMotion over distance.

EMC AccessAnywhere: The enabling technology that underlies the ability of VPLEX to provide access to information between clusters separated by distance.

GeoSynchrony: The operating system running on VPLEX directors. GeoSynchrony is an intelligent, multitasking, and locality-aware operating environment that controls the data flow for virtual storage.

High availability: A system-design approach and associated service implementation that ensures a pre-arranged level of operational performance will be met during a contractual measurement period.

Latency: An amount of elapsed time. In this document, latency may refer to the time required to fulfill an I/O request or to the round-trip time (RTT) required to send a message over a network and back.

Management server: Each VPLEX cluster has one management server. The management server provides the connectivity to the customer's IP network and serves as the management access point for the VPLEX cluster.

Recovery Point Objective (RPO): The maximum amount of data that can be lost in a given failure event.

Recovery Time Objective (RTO): The duration of time within which a business process must be restored after a disaster (or disruption) to avoid unacceptable consequences associated with a break in business continuity.

Virtual volume: The topmost device within the VPLEX I/O stack that can be presented to a host or multiple hosts.

VPLEX cluster: Two or more VPLEX directors forming a single fault-tolerant cluster.


VPLEX director: A CPU module that runs GeoSynchrony, the core VPLEX operating environment. There are two directors in each engine, and each has dedicated resources and is capable of functioning independently.

VPLEX engine: A VPLEX enclosure that contains two directors, management modules, and redundant power to ensure high availability and no single point of failure.

Workload: In the context of this document, virtual machines and their corresponding applications and storage.

Workload federation: Dynamically distributing or balancing the workloads as effectively as possible, regardless of physical location, while optimizing business and operational goals.


Technology overview

The VPLEX solution for Vblock systems uses the following key hardware and software components and technologies:

VCE Vblock Systems

VMware vCenter Server

VMware vCenter Server Heartbeat

VMware vSphere High Availability (HA)

VMware vMotion

EMC VPLEX

VCE Vblock™ Systems

Vblock systems combine industry-leading compute, network, storage, virtualization, and management technologies into prepackaged units of infrastructure. Through the standardization of building blocks, the Vblock system dramatically simplifies IT operations—accelerating deployment while reducing costs and improving service levels for all workloads, including the most demanding and mission-critical enterprise applications.

Vblock systems scale to deliver the right performance and capacity to match the needs of business applications. The following Vblock systems are available:

• Vblock Series 300 is designed to address a wide spectrum of virtual machines, users, and applications and is ideally suited to achieve the scale required in private and public cloud environments. Vblock 300 scales from smaller- to mid-sized enterprise Customer Relationship Management (CRM), Supply Chain Management (SCM), e-mail, file and print, and collaboration deployments.

• Vblock Series 700 is designed for deployments involving very large numbers of virtual machines and users and is ideally suited to meet the higher performance and availability requirements of critical business applications. Vblock 700 scales to the largest deployments of enterprise CRM and SCM, data center operations, and service provider cloud computing offerings.

For more information on Vblock systems, refer to the Vblock Infrastructure Platforms Technical Overview.

VMware vCenter Server

VMware vCenter Server provides a scalable and extensible platform that forms the foundation for virtualization management. It centrally manages VMware vSphere environments, allowing IT administrators control over the virtual environment. A single vCenter Server instance controls the primary and secondary Vblock systems in the VPLEX deployment. Each Vblock system contains a copy of vCenter Server so that management can continue if one instance fails; these vCenter Server instances must be identical.

For more information on VMware vCenter Server, go to the VMware vCenter Server web page.


VMware vCenter Server Heartbeat

VMware vCenter Server Heartbeat delivers high availability for vCenter Server by protecting the virtual and cloud infrastructure from application, configuration, operating system, network, and hardware-related problems.

Heartbeat is a clustering solution with primary and secondary nodes operating in active-passive mode. Heartbeat keeps the vCenter Server instances at each site synchronized. Changes to the vCenter Server configuration at one site are reflected on the other site.

For more information on vCenter Heartbeat, refer to the VMware vCenter Server Heartbeat Administrator Guide.

Note: In a VPLEX deployment, a VMware vCenter Heartbeat license is required for each instance of vCenter Server being protected with Heartbeat.

VMware vSphere HA

VMware vSphere HA is an easy-to-use, cost-effective feature for ensuring continuous operation of applications running on virtual machines. HA continuously monitors all virtualized servers in a resource pool and detects failures in the physical server and operating system. In the event of physical server failure, affected virtual machines are restarted automatically on other production servers with spare capacity. In the case of operating system failure, HA restarts the affected virtual machine on the same physical server. When combined with VPLEX distributed storage, HA also provides full automatic recovery from a complete site disaster.

VMware vMotion

Included with VMware vSphere, VMware vMotion enables live migration of virtual machines from one physical server to another while continuously powered up. This process takes place with no noticeable effect from the end user’s point of view. An administrator can take a virtual machine offline for maintenance or upgrading without subjecting the system's users to downtime. Migrating a virtual machine with VMware vMotion preserves its precise execution state, network identity, and the active network connections. As a result, there is zero downtime and no disruption to the user.

Combined with Vblock systems and VPLEX, vMotion enables effective distribution of applications and their data across multiple virtual hosts within synchronous distances. With virtual storage and virtual servers working together over distance, the infrastructure can provide load balancing, realtime remote data access, and improved application protection.

VMware vMotion is the key technology that underpins VMware DRS. DRS continuously monitors the pooled resources of many servers and intelligently allocates available resources among virtual machines based on pre-defined rules that reflect business needs and priorities. The result is a self-managing, highly optimized, and efficient IT environment with built-in, automated load balancing.

For more information on VMware vMotion, refer to Workload Mobility with VMware vMotion and EMC VPLEX on Vblock Platforms.


EMC VPLEX

VPLEX is an enterprise-class, storage federation technology that aggregates and manages pools of Fibre Channel (FC)-attached storage within and among data centers. VPLEX resides between the servers and the FC-attached storage, and presents local and distributed volumes to hosts.

VPLEX enables dynamic workload mobility and continuous availability within and between Vblock systems over distance. It provides simultaneous access to storage devices at two sites through the creation of VPLEX distributed virtual volumes, supported on each side by a VPLEX cluster.

For more information on VPLEX, refer to the EMC VPLEX 5.0 Architecture Guide.

VPLEX Local

VPLEX Local provides seamless, non-disruptive data mobility and the ability to manage multiple heterogeneous arrays from a single interface within a data center. VPLEX Local allows increased availability, simplified management, and improved utilization across multiple arrays.

VPLEX Metro

VPLEX Metro with EMC AccessAnywhere delivers distributed federation and enables active/active block-level access to data between two sites within synchronous distances of up to a 5-millisecond round-trip time. VPLEX Metro, in combination with VMware vMotion, allows transparent movement and relocation of virtual machines and their corresponding applications and data over distance. AccessAnywhere enables a single copy of data to be shared, accessed, and relocated over distance.

VPLEX Witness

VPLEX Witness is an optional, but highly recommended, component designed for deployment in customer environments where the regular bias rule sets are insufficient to provide seamless zero or near-zero Recovery Time Objective (RTO) failover in the event of site disasters and VPLEX cluster failures.

By reconciling its own observations with the information reported periodically by the clusters, VPLEX Witness enables the cluster(s) to distinguish between inter-cluster network partition failures and cluster failures and to resume I/O automatically in these situations.

For more information, refer to the EMC VPLEX Metro Witness Technology and High Availability TechBook.


Use Cases

The two most common use cases for deploying VPLEX with the Vblock system are business continuity and workload mobility. Organizations rely on the continuity of their data centers as an essential part of their business. In addition, as data centers become more geographically dispersed, IT organizations need to be able to dynamically and non-disruptively move workloads from one physical location to another.

Business continuity

Business continuity, sometimes referred to as disaster avoidance, means keeping the business-critical applications operational, even after a disaster. Business continuity describes the processes and procedures an organization establishes to ensure that essential functions can continue during and after a disaster. Business continuity planning seeks to prevent interruption of mission-critical services, and to reestablish full functioning as swiftly and smoothly as possible by using an automated process with zero data loss and near-zero recovery time.

VPLEX, in combination with Vblock systems, facilitates business continuity through the creation of distributed virtual volumes, which are storage volumes located in two separate Vblock systems. These virtual volumes are 100% in sync at all times. The VPLEX solution provides the dynamic storage infrastructure required to migrate applications, virtual machines, and data within and between remote locations with no disruption of service.

VPLEX provides data access from the primary and target sites in an active-active mode, eliminating the need to move the underlying storage and making migration dramatically faster. AccessAnywhere, the distributed cache coherent technology in VPLEX, enables simultaneous read/write data access.

Properly configured, VPLEX delivers a zero RPO and a near-zero RTO.

Workload and data mobility

The combination of the converged infrastructure of the Vblock system, VMware vMotion, and VPLEX allows administrators to relocate virtual machines along with the corresponding applications and storage. Data centers can now pool capacity to improve infrastructure utilization, refresh technology, or load balance within or across data centers. The data centers can also enhance Service Level Agreements (SLAs) by providing high availability, increased resiliency, and business continuity for critical applications and data. Workload and data mobility with VPLEX can be automatic with DRS, or manual with vMotion.

For more information about workload and data mobility, refer to Workload Mobility with VMware vMotion and EMC VPLEX Metro on Vblock Infrastructure Platforms.


Deployment guidelines and best practices

The amount of data and the data change rate of the storage volumes that require business continuity protection determine the specific VPLEX configuration required. The following sections provide deployment best practices and guidelines to help choose the best configuration for the environment and business needs.

Planning the deployment

Deployment planning is a critical step in the successful implementation of the VPLEX with Vblock system solution. Each enterprise has its own set of goals, requirements, and priorities to consider.

Table 1 lists deployment prerequisites and guidelines to consider before beginning deployment.

Table 1. Deployment prerequisites and guidelines

Applications: Determine which applications and storage volumes VPLEX will manage, the data change rates, and the business's application priorities. Note: In the event of a site disaster, all failed applications are restarted on the other site. Define the priority of applications, that is, which ones are the most critical, and use VMware Startup Priority to get the most critical applications back online first.

Recovery plan: Develop and test a recovery plan.

Bandwidth: Ensure that the Layer-2 network extends across the sites and that there is adequate bandwidth between sites for the data change rate. If using vMotion, allow for 622 Mb/sec of bandwidth (at a 5-millisecond maximum round-trip time) above the VPLEX bandwidth requirement.

Disk space: For a business continuity deployment with near-zero RTO, protected data and applications reside on both sites through distributed virtual volumes; allow adequate disk space on both sites for this duplication. Also allow adequate disk space for metadata and logging. Refer to Configuring metadata and logging volumes for more information.

VMware vCenter instance consistency: VMware vCenter Heartbeat synchronizes changes made on either site to the other. When using Heartbeat, ensure that the vCenter Server instances on the Vblock systems at both sites are identical and not installed on a distributed device. Note: In a VPLEX deployment, a VMware vCenter Heartbeat license is required for each instance of vCenter Server being protected with Heartbeat.

Site hardware/software resources: Protected applications can be distributed across the two sites, and the workload split need not be 50/50. However, for business continuity to work, each site must be able to run 100 percent of the combined workload from both sites: site A must have enough resources to run 100 percent of the VPLEX-protected applications, and site B must have enough resources to run 100 percent of the VPLEX-protected applications.
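The bandwidth and failover-capacity guidelines above lend themselves to a quick sanity check. The following Python sketch is not from the original paper; the input figures are placeholders, and the 622 Mb/sec value is simply the vMotion allowance quoted in Table 1.

```python
# Quick sanity checks for the planning guidelines in Table 1.
# All input figures below are illustrative placeholders.

VMOTION_ALLOWANCE_MBPS = 622  # vMotion allowance above the VPLEX requirement (5 ms max RTT)

def required_intersite_bandwidth_mbps(data_change_rate_mbps, use_vmotion=True):
    """Inter-site bandwidth must at least cover the data change rate,
    plus the vMotion allowance when vMotion is used between sites."""
    bandwidth = data_change_rate_mbps
    if use_vmotion:
        bandwidth += VMOTION_ALLOWANCE_MBPS
    return bandwidth

def site_can_absorb_failover(site_capacity, combined_protected_workload):
    """Each site must be able to run 100 percent of the combined protected workload."""
    return site_capacity >= combined_protected_workload

if __name__ == "__main__":
    print(required_intersite_bandwidth_mbps(800))                     # 1422
    print(site_can_absorb_failover(site_capacity=120,                 # e.g. total GHz or vCPUs
                                   combined_protected_workload=100))  # True
```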

VPLEX to Vblock system mappings

Table 2 maps Vblock system models to VPLEX cluster configurations. In this table, it is assumed that only one Vblock system is connected to one VPLEX cluster. For example, a single Vblock 700LX can be connected to a single-engine or dual-engine VPLEX cluster, but not to a quad-engine VPLEX cluster.

Table 2. VPLEX to Vblock Platform Mappings

Vblock Platform    Single-Engine    Dual-Engine    Quad-Engine
300 series         Yes              Yes            No (1)
700LX              Yes              Yes            No (1)
700MX              Yes              Yes            Yes

(1) The Vblock Series 300 and Vblock 700LX have only 16 ports available for VPLEX front-end and back-end connections.

VPLEX port requirements per connected Vblock system

The following table shows the range of FC connections for each connected Vblock system. The specific number of connections required is determined during the VPLEX sizing effort.

Two separate physical WAN links should be used to connect the two VPLEX clusters.

Note: One or two Vblock systems can be connected to each VPLEX cluster. If the Vblock systems are Cross-Cluster Connected, then only one Vblock system can be connected to the VPLEX cluster.


Table 3. VPLEX Port Requirements per Connected Vblock Platform
(FC connection counts per VPLEX cluster: one engine/2 directors, two engines/4 directors, four engines/8 directors)

VPLEX front-end (UCS-facing) ports: 4 / 8 / 16. Carries read/write I/O from VPLEX-protected applications.

VPLEX back-end (storage-facing) ports: 4 / 8 / 16. Carries application reads/writes and the changes mirrored from site A to site B.

WAN ports for VPLEX communications traffic, 4 Gb or 8 Gb FC (10 GbE can also be used): 4 / 8 / 16. Sizing is based on performance requirements.
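For quick sizing discussions, the per-engine scaling in Table 3 can be captured in a small helper. This is only a convenience sketch of the table; the actual connection count is determined during the VPLEX sizing effort.

```python
def vplex_port_counts(engines):
    """Port counts per VPLEX cluster for the configurations in Table 3.
    Each engine holds 2 directors and contributes 4 front-end, 4 back-end,
    and 4 WAN ports in this table."""
    if engines not in (1, 2, 4):
        raise ValueError("VPLEX clusters are configured with 1, 2, or 4 engines")
    return {
        "directors": engines * 2,
        "front_end_ports": engines * 4,   # UCS-facing
        "back_end_ports": engines * 4,    # storage-facing
        "wan_ports": engines * 4,         # inter-cluster communication
    }

print(vplex_port_counts(2))
# {'directors': 4, 'front_end_ports': 8, 'back_end_ports': 8, 'wan_ports': 8}
```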

Best practices for host front-end connectivity

Best practices for host front-end connectivity are:

All shared data volumes for a single ESX cluster should either all reside on VPLEX datastores or all reside on non-VPLEX datastores; do not mix the two within a cluster.

The ESX blades access the SAN Boot LUNs directly without going through the VPLEX.

The SAN Boot LUNs and virtualized shared data LUNs for a single ESX cluster utilize the same four array front-end ports. Array front-end ports are not dedicated or reserved for virtualized volumes (VPLEX back-end access).

For ease of management and support, configure a single VSAN on the SAN fabric switches that contain the host initiators, storage array front-end port initiators, and the VPLEX engine back-end and front-end ports.

Cross-connect scenarios may require an additional VSAN to ensure the primary data VSAN does not merge with the remote primary data VSAN across the intersite link (ISL).

Zone VPLEX back-end ports only to the array FA ports to which virtualized data volumes are mapped. Maintain the currently recommended ESX cluster to FA port mappings documented in the Vblock Systems Logical Build Guide, available on the VCE Technical Documentation website (http://vblockproductdocs.ent.vce.com/).

Ports on the storage arrays are reserved for add-on components and features such as RecoverPoint, data migration, VG Gateways (Vblock Series 700 only), and SAN backup. Do not use these array front-end ports for VPLEX or direct host access. Refer to the appropriate Vblock Systems Logical Build Guide and Port Assignment Reference available on the VCE Technical Documentation website (http://vblockproductdocs.ent.vce.com/) for detailed information about reserved ports.


Deploying VPLEX Witness

An external VPLEX Witness server is installed as a virtual machine running on a customer-supplied VMware ESXi host deployed in a failure domain separate from either VPLEX cluster. VPLEX Witness connects to both VPLEX clusters over the IP management network.

If two Vblock systems are connected to each VPLEX cluster in a VPLEX deployment, a separate VPLEX Witness is required for each Vblock system-VPLEX cluster pairing.

For more information, refer to the EMC VPLEX Metro Witness Technology and High Availability TechBook.

Monitoring VPLEX

Various tools and techniques are available to monitor performance and to identify and diagnose problems on VPLEX systems.

Power and environmental monitoring

A GeoSynchrony service performs the overall health monitoring of the VPLEX cluster and provides environmental monitoring for the VPLEX cluster hardware. It monitors various power and environmental conditions at regular intervals and logs any condition changes into the VPLEX messaging system.

All component failures that occur within a VPLEX system are reported through events that call back to the EMC Service Center to ensure timely response and repair of these fault conditions.

Event logging

The VPLEX cluster provides event logs and call home capability by means of EMC Secure Remote Support (ESRS).

VPLEX includes services, processes, components, and operating systems that write entries to various logs. Logs are collected for:

Scheduled activities: SYR collection

On-demand utilities: collect-diagnostics

Call home events

Event messages notify users of changing conditions under which the system is operating. Depending on their severity, these messages may generate a call home.

Refer to Best practices for logging volumes for additional information.


Daily overall health monitoring

Once configured and operational, the VPLEX system does not require ongoing administration or tuning. However, VCE strongly recommends a daily health check to ensure problems are caught before they have a negative impact.

To check the overall health of the VPLEX system:

1. Check the System Status dashboard in the management console to see high-level health indications.

2. Use the following CLI commands:

validate-system-configuration: Performs a basic system configuration check.

cluster status: Displays a cluster's operational status and health state.

export storage-view summary: Lists each storage view, and the number of volumes and initiators that it contains (identifies failed devices).

connectivity show: Displays the communication protocol endpoints that can see each other.

export port summary: Summarizes any unhealthy ports.

For more information about using the VPLEX CLI, refer to the EMC VPLEX CLI Guide (P/N 300-012-311), available on the EMC Powerlink website at http://Powerlink.EMC.com. Registration is required.
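The daily checks above can be scripted against the management server. The sketch below is illustrative only: it assumes SSH key access to the management server at a placeholder address and assumes the VPlexcli shell accepts commands piped on standard input; verify the invocation against your GeoSynchrony release before relying on it.

```python
#!/usr/bin/env python3
"""Illustrative daily VPLEX health check (hostname and invocation style are assumptions)."""
import subprocess

MGMT_HOST = "service@vplex-mgmt.example.com"   # placeholder management server address

# CLI commands recommended in this paper for the daily overall health check
HEALTH_CHECKS = [
    "validate-system-configuration",
    "cluster status",
    "export storage-view summary",
    "connectivity show",
    "export port summary",
]

def run_daily_checks():
    # Assumption: the VPlexcli shell on the management server reads commands
    # from standard input; adjust to your environment if it does not.
    session = subprocess.run(
        ["ssh", MGMT_HOST, "vplexcli"],
        input="\n".join(HEALTH_CHECKS) + "\nexit\n",
        capture_output=True, text=True, timeout=300,
    )
    return session.stdout + session.stderr

if __name__ == "__main__":
    output = run_daily_checks()
    print(output)
    # Flag obviously unhealthy keywords for follow-up; this is a coarse filter,
    # not a substitute for reading the output.
    for keyword in ("error", "unhealthy", "critical", "degraded"):
        if keyword in output.lower():
            print(f"ATTENTION: output contains '{keyword}'")
```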

Performance monitoring

The Performance Monitoring dashboard in the management console provides a customized view into the performance of the VPLEX system. Performance information for the current 5-minute window is displayed as a set of charts. Check the dashboard to make sure that:

Current performance is within thresholds for WAN and CPU utilization. CPU utilization should not exceed 50 percent.

Front-end and back-end latency statistics are normal

Front-end and back-end latency are the most important performance metrics to monitor. The Performance Monitoring dashboard shows the average front-end latency and average back-end latency for the VPLEX system in graphical form over time. If there is a latency issue, this data will indicate the cause:

If the difference between average front-end latency and average back-end latency is small, investigate the problem within the Vblock storage.

If the difference between average front-end latency and average back-end latency is large, investigate the problem within the VPLEX cluster.
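The front-end versus back-end latency comparison above amounts to a simple decision rule. The helper below is only a sketch of that rule; the 1.0 ms "small gap" threshold is a placeholder, not a figure from the paper.

```python
def latency_triage(front_end_ms, back_end_ms, small_gap_ms=1.0):
    """Apply the rule of thumb above: a small gap between average front-end and
    back-end latency points at the Vblock storage; a large gap points at the
    VPLEX cluster. The 1.0 ms threshold is an illustrative placeholder."""
    gap = front_end_ms - back_end_ms
    if gap <= small_gap_ms:
        return "Investigate within the Vblock storage (front-end tracks back-end closely)."
    return "Investigate within the VPLEX cluster (front-end latency well above back-end)."

print(latency_triage(front_end_ms=6.2, back_end_ms=5.9))
print(latency_triage(front_end_ms=9.5, back_end_ms=2.1))
```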


Configuring metadata and logging volumes

VPLEX stores configuration and metadata on system volumes created from storage devices. The two types of system volumes are metadata volumes (also referred to as meta-volumes) and logging volumes.

For detailed information about system volumes, refer to EMC VPLEX GeoSynchrony Release 5.1 Administration Guide (P/N 300-013-919-01), available on the EMC Powerlink website at http://Powerlink.EMC.com. Registration is required.

Best practices for metadata volumes

VPLEX metadata includes virtual-to-physical mappings, data about devices, virtual volumes, and system configuration settings. Metadata is stored in the cache and backed up on specially designated external volumes. Performance is not critical for metadata volumes, but availability of metadata volumes is essential for system recovery.

The best practices for configuring metadata volumes on a VPLEX are:

For each VPLEX cluster, allocate four storage volumes of at least 80 GB as metadata volumes.

Configure the metadata volumes for each cluster with multiple back-end storage volumes provided by different storage arrays of the same type.

Use RAID 6 or RAID 5 for metadata volumes. The data protection capabilities provided by these storage arrays ensure the integrity of the system's metadata.

Create backup copies of the metadata whenever configuration changes are made to the system.

Perform regular backups of the metadata volumes on storage arrays that are separate from the arrays used for metadata volumes.

Best practices for logging volumes

During and after link outages, logging volumes are subject to high levels of I/O. Thus, logging volumes must be able to service I/O quickly and efficiently.

The best practices for configuring logging volumes on a VPLEX are:

Create one logging volume for each cluster.

Use RAID 10 for logging volumes. The data protection capabilities provided by the storage array ensure the integrity of the logging volumes.

Configure at least 1 GB of logging volume space for every 16 TB of distributed device space. Slightly more space is required if the 16 TB of distributed storage is composed of multiple distributed devices because a small amount of non-logging information is also stored for each distributed device.
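As a rough illustration of the sizing rule above, the sketch below computes a starting point for logging volume capacity per cluster. The per-device allowance is a placeholder, since the paper only notes that slightly more space is needed when the distributed storage is split across multiple distributed devices.

```python
def logging_volume_gb(distributed_tb, distributed_device_count=1,
                      per_device_allowance_gb=0.1):
    """Starting point for logging volume capacity per cluster: at least 1 GB per
    16 TB of distributed device space, plus a small allowance per distributed
    device. The 0.1 GB per-device allowance is a placeholder, not a VPLEX figure."""
    base_gb = distributed_tb / 16.0
    return base_gb + distributed_device_count * per_device_allowance_gb

# Example: 64 TB of distributed storage spread across 8 distributed devices
print(round(logging_volume_gb(distributed_tb=64, distributed_device_count=8), 2))  # 4.8
```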


Virtual networking high availability

A Cisco Nexus 1000V distributed virtual switch manages virtual networking for the Vblock system. It provides a common management model for physical and virtual network infrastructures that include policy-based virtual machine connectivity, mobility of virtual machine security and network properties, and a non-disruptive operational model.

For VPLEX clusters, VCE recommends that the Virtual Supervisor Modules (VSMs) of the Nexus 1000V be moved from the Advanced Management Pod (AMP) to the UCS blades in the Vblock system. As a result, VPLEX virtual volumes protect the active and standby VSMs, and ESXi cluster HA restarts the VSMs automatically in the event of a disaster.

Using host affinity groups

Each distributed virtual volume has a preferred VPLEX cluster based on the detach rule configured for it. Under normal operation conditions, virtual volumes are available at both VPLEX clusters. However, in many failure cases, the virtual volumes are available only at the preferred VPLEX cluster.

Therefore, it is recommended that applications run at the VPLEX cluster preferred by the virtual volumes the application is using in the event of any scenario that invokes the VPLEX preference rule (such as a WAN partition). At the same time, the flexibility to have virtual machines move to the non-preferred site in case of failures or load spikes is desirable.

VMware DRS host affinity rules can be used to ensure that virtual machines always run at the location toward which the storage they rely on is biased.

For example, hosts and virtual machines might be organized into groups A and B. VM group A is configured to run on host group A whenever possible. Host group A contains the UCS blades in one Vblock system (Vblock1) and Host group B contains the UCS blades in the other Vblock system (Vblock2).

Any virtual machine relying on datastores for which the underlying virtual volume is preferred in Vblock1 is put in VM group A. Any virtual machine relying on datastores that have Vblock2 as the preferred site is put in VM group B. The host affinity rule can then specify that whenever possible, VM group A should run on host group A, and VM group B should run on host group B.

In this way, the virtual machines stay in the location where they have the highest possible availability, but maintain the ability to move to the other location if the preferred location is unable to host them.
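The grouping logic described above can be expressed compactly. The sketch below uses hypothetical datastore and group names; in practice the VM groups, host groups, and "should run on" rules would be created in vCenter rather than in a script.

```python
# Hypothetical datastore-to-preferred-site mapping; names are placeholders.
DATASTORE_PREFERRED_SITE = {
    "ds_sales_01": "Vblock1",
    "ds_sales_02": "Vblock1",
    "ds_hr_01": "Vblock2",
}

def vm_group_for(vm_datastores):
    """Place a VM in the DRS VM group that matches the Vblock system its
    storage prefers; flag VMs whose datastores disagree instead of guessing."""
    sites = {DATASTORE_PREFERRED_SITE[ds] for ds in vm_datastores}
    if sites == {"Vblock1"}:
        return "VM group A"   # rule: should run on host group A (Vblock1 UCS blades)
    if sites == {"Vblock2"}:
        return "VM group B"   # rule: should run on host group B (Vblock2 UCS blades)
    return "REVIEW: datastores with mixed preferred sites"

print(vm_group_for(["ds_sales_01", "ds_sales_02"]))  # VM group A
print(vm_group_for(["ds_hr_01"]))                    # VM group B
```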

For more information, refer to the EMC VPLEX Metro Witness Technology and High Availability TechBook.


Compatibility guidelines for VPLEX with UIM/P

EMC Ionix Unified Infrastructure Manager/Provisioning (UIM/P) provides simplified management for Vblock systems, including provisioning, configuration, change, and compliance management. UIM/P offers a consolidated dashboard view, policy-based management, automated deployment, and deep visibility across the environment.

Because UIM/P does not yet support discovery and management of VPLEX hardware, VPLEX cannot be used on a storage volume (or storage pool) allocated by UIM/P. Observing the following guidelines will help avoid conflicts with UIM/P:

In deployments that will include VPLEX, do not allocate 100% of the storage for the Vblock system with UIM/P. Reserve an adequate amount of storage for VPLEX virtual volumes. The volumes can be reallocated with UIM/P later if they are no longer needed for VPLEX.

If 100% of the storage for the Vblock system is allocated with UIM/P already, the Vblock system must have room to add more (new) storage, and the added storage cannot be allocated with UIM/P.

UIM/P displays alarms when it sees storage that it has not allocated and zone members that it has not defined. These alarms should be ignored. Trying to fix them may cause an operational error. This only happens in Cross-Cluster Connect deployments.

Array migration is not supported under VPLEX because service boot disks come directly from the array and do not flow through VPLEX.

VPLEX storage should come from a separate, ungraded storage pool or disk group. If UIM/P has not graded the storage, it does not try to manage the storage and does not set alarms when storage is allocated from it.

If using VPLEX with multiple Vblock platforms, each Vblock platform must have its own UIM/P to be managed separately even if there is only one vCenter Server.

Oracle Real Application Clusters

EMC AccessAnywhere clustering technology allows read/write access to distributed volumes across distance, where the volumes have the same SCSI LUN identity. This allows hypervisors to migrate virtual machines across distance, and allows application clusters such as Oracle Real Application Clusters (RAC) to provide high availability across distance.

For more information about VPLEX Metro working in conjunction with Oracle RAC on the Vblock system, refer to Oracle Extended RAC With EMC VPLEX Metro Best Practices Planning.


Recommended configurations

All of the following configuration examples assume the following:

The term site refers to a location. Sites can be rooms or floors in a single building, multiple buildings in a campus environment, or data centers separated by distance.

Both sites have access to 100% of the applications and data protected by VPLEX Metro through distributed virtual volumes.

Site A is running 50% of the VPLEX Metro protected applications, but must have enough VMware resources to run 100% of the applications.

Site B is running 50% of the VPLEX Metro protected applications, but must have enough VMware resources to run 100% of the applications.

In the case of a site failure, there will be zero data loss, but there will be a delay of approximately one minute while the virtual guests start up on the other site.

Non Cross-Cluster Connect configurations

A Non Cross-Cluster Connect configuration:

Delivers zero RPO in all single failures, including that of an entire site

Delivers zero RTO in the event of a storage array failure

For a server domain failure or VPLEX failure, limits RTO to the application restart time

Requires a ping round-trip time of less than 5 milliseconds


Non Cross-Cluster Connect without VPLEX Witness

This example shows a Non Cross-Cluster Connect configuration without VPLEX Witness. In this configuration, recovery processes will need to be started manually.


Non Cross-Cluster Connect with VPLEX Witness

This example shows a Non Cross-Cluster Connect configuration where VPLEX Witness resides on an ESXi host at a third site (Site 3). In this configuration, if either Site 1 or Site 2 fails, VPLEX Witness can automatically start recovery processes. The third site (Site 3) must be a location outside of the failure domain. For example, if the objective is to protect against a fire, VPLEX Witness needs to be outside the fire zone. If the objective is to protect against an earthquake, VPLEX Witness must be outside the earthquake zone.


Cross-Cluster Connect configuration

Servers in a Cross-Cluster Connect configuration can access data from the VPLEX at either site. Cross-Cluster Connect is suitable for deploying within a campus, or in multiple isolated zones within a single data center. This configuration eliminates the need to do server failover when an entire storage cluster (VPLEX or the storage behind it) goes down.

A Cross-Cluster Connect configuration:

Delivers zero RPO in all single failures, including that of an entire site

Delivers zero RTO in the event of a storage domain failure

For a server domain failure, limits RTO to the application restart time

Requires a ping round-trip time of less than 1 millisecond

Refer to the EMC VPLEX Metro Witness Technology and High Availability TechBook for detailed descriptions of these failure scenarios.


Shared VPLEX configuration

Two Vblock systems can share a single VPLEX cluster. The following diagram shows how the Vblock systems connect to the front-end (FE) and back-end (BE) ports on the VPLEX engine. In this configuration, a separate VPLEX Witness is required for each Vblock system.


Data flow

The following diagram illustrates the logical flow of data between sites.


Failure scenarios

The following sections describe failure scenarios with their associated VPLEX behavior and recovery procedures.

Application and management failover

Distributed virtual volumes managed by VPLEX at both sites are 100% in sync at all times. With VPLEX Witness, in the event of the failure of one site, the data is available immediately at the alternate site in a crash consistent state. No scripting, failover declaration, or actions are required.

If the servers at the failed site have gone down, VMware HA restarts the affected virtual machines at the other site automatically, just as it would after the failure of a single server or a set of servers within one site.

VMware vCenter Server Heartbeat ensures the ability to fail over vCenter management if the primary vCenter instance has gone down. Heartbeat monitors the availability of all components of vCenter Server at the application and service layer, with the ability to restart or restore individual services. It uses a passive server instance to provide rapid failover and failback of vCenter Server and its components.

Refer to the VMware vCenter Server Heartbeat Administrator Guide for detailed information about configuring Heartbeat for failover and recovering from a failover.

The following factors determine how long it takes an application to become operational after failover:

Number of virtual machines being restarted

Location of the application in the priority sequence

Any application-dependent tasks that must be performed to get the application restarted from a crash-consistent state (for example, reestablish credentials, file-system check, database log rollback)

Application and management failback

After the failed site is restored, the applications that failed over must be failed back. VPLEX resynchronizes the distributed virtual volumes and makes them available at the previously failed site automatically. If VMware DRS is in use, it moves virtual machines back to the previously failed site automatically according to the policies with which it has been configured. It is also possible to move virtual machines back manually with a vMotion operation.

Depending on the configuration, some manual intervention may be required to make the vCenter Server primary again for applications on the failed site. Refer to the VMware vCenter Server Heartbeat Administrator Guide for detailed information about making the primary server active again.


Failure scenarios

The following sections provide a comprehensive list of failure scenarios for Non Cross-Cluster Connect and Cross-Cluster Connect configurations. Each scenario includes the associated VPLEX behavior and VMware HA recovery procedures.

The deployment for these scenarios consists of a VMware HA/DRS cluster across two sites using ESXi 5.0 hosts. vCenter Server 5.0 manages the cluster and connects to the ESXi hosts at both sites. The vSphere management, vMotion management, and virtual machine networks are connected using a redundant network between the two sites.

A VPLEX Metro solution federated across the two sites provides the distributed storage to the ESXi hosts. The SAN Boot LUN is on the back-end storage array, but not on the distributed virtual volume. The virtual machine runs on the preferred site of the distributed virtual volume.

Refer to the EMC VPLEX Metro Witness Technology and High Availability TechBook for detailed descriptions of these failure scenarios.

Go to the VMware web site at www.vmware.com/support/ for the most up-to-date technical documentation.

Non Cross-Cluster Connect with Witness failure scenarios

Scenario: Single VPLEX BE path failure
VPLEX behavior: VPLEX continues to operate using an alternate path to the same BE array. There is no impact to the distributed virtual volumes exposed to the ESXi hosts.
Impact/observed VMware HA behavior: None.

Scenario: Single FE path failure
VPLEX behavior: The ESXi server is expected to use alternate paths to the distributed virtual volumes.
Impact/observed VMware HA behavior: None.

Scenario: BE array failure at site A
VPLEX behavior: VPLEX continues to operate using the array at site B. When the array is recovered from the failure, the storage volume at site A is resynchronized from site B automatically.
Impact/observed VMware HA behavior: None.

Scenario: BE array failure at site B
VPLEX behavior: VPLEX continues to operate using the array at site A. When the array is recovered from the failure, the storage volume at site B is resynchronized from site A automatically.
Impact/observed VMware HA behavior: None.

Scenario: VPLEX director failure
VPLEX behavior: VPLEX continues to provide access to the distributed virtual volume through other directors on the same VPLEX cluster.
Impact/observed VMware HA behavior: None.

Scenario: Complete site A failure (all ESXi hosts and the VPLEX cluster at site A)
VPLEX behavior: VPLEX continues to serve I/O on the surviving site (site B). When the VPLEX at the failed site (site A) is restored, the distributed virtual volumes are synchronized automatically from the active site (site B).
Impact/observed VMware HA behavior: Virtual machines running at the failed site fail. HA automatically restarts them on the surviving site.

Scenario: Complete site B failure (all ESXi hosts and the VPLEX cluster at site B)
VPLEX behavior: VPLEX continues to serve I/O on the surviving site (site A). When the VPLEX at site B is restored, the distributed virtual volumes are synchronized automatically from the active site (site A).
Impact/observed VMware HA behavior: Virtual machines running at the failed site fail. HA automatically restarts them on the surviving site.

Scenario: Multiple ESXi host failures (power off)
VPLEX behavior: None.
Impact/observed VMware HA behavior: VMware HA restarts the virtual machines on any of the surviving ESXi hosts within the HA cluster.

Scenario: Multiple ESXi host failures (network disconnect)
VPLEX behavior: None.
Impact/observed VMware HA behavior: VMware HA continues to exchange cluster heartbeat through the shared datastore. No virtual machine failovers occur.

Scenario: ESXi host experiences APD (All Paths Down), encountered when the ESXi host loses access to its storage volumes (in this case, VPLEX volumes)
VPLEX behavior: None.
Impact/observed VMware HA behavior: In an APD scenario, the ESXi host must be restarted to recover. If the ESXi host is restarted, HA restarts the failed virtual machines on other surviving ESXi hosts within the HA cluster.

Scenario: VPLEX inter-site link (ISL) failure; vSphere cluster management network intact
VPLEX behavior: VPLEX transitions distributed virtual volumes on the non-preferred site to the I/O failure state. On the preferred site, the distributed virtual volumes continue to provide access.
Impact/observed VMware HA behavior: Virtual machines running in the preferred site are not affected. Virtual machines running in the non-preferred site experience I/O failure and fail; HA fails these virtual machines over to the other site. Best practice is to run the virtual machines on the preferred site.

Scenario: VPLEX cluster failure (the VPLEX at either site A or site B has failed, but ESXi and other LAN/WAN/SAN components are intact)
VPLEX behavior: I/O continues to be served on all the volumes on the surviving site.
Impact/observed VMware HA behavior: The ESXi hosts located at the failed site experience an APD condition. The ESXi hosts must be restarted to recover from the failure.

Scenario: Complete dual site failure
VPLEX behavior: Upon restoration of the two sites, VPLEX continues to serve I/O. The best practice is to bring up the BE storage arrays first, followed by VPLEX.
Impact/observed VMware HA behavior: The ESXi hosts should be brought up only after VPLEX is fully recovered and the distributed virtual volumes are synchronized. When the ESXi hosts at each site are powered on, the virtual machines are restarted and resume normal operations.

Scenario: Director failure at one site (the preferred site for a given distributed virtual volume) and BE array failure at the other site (the secondary site for that distributed virtual volume)
VPLEX behavior: The surviving VPLEX directors within the VPLEX cluster with the failed director continue to provide access to the distributed virtual volumes. VPLEX continues to provide access to the distributed virtual volumes using the preferred site BE array.
Impact/observed VMware HA behavior: None.

Scenario: VPLEX ISL intact; vSphere cluster management network failure
VPLEX behavior: None.
Impact/observed VMware HA behavior: Virtual machines on each site continue running on their respective hosts since the HA cluster heartbeats are exchanged through the shared datastore.

Scenario: VPLEX ISL failure and vSphere cluster management network failure simultaneously
VPLEX behavior: VPLEX fails I/O on the non-preferred site for a given distributed virtual volume. The distributed virtual volume continues to provide access on the preferred site.
Impact/observed VMware HA behavior: For virtual machines running in the preferred site, powered-on virtual machines continue to run. This is an HA split-brain situation: the non-preferred site thinks that the hosts of the preferred site are dead and tries to restart the powered-on virtual machines of the preferred site. Virtual machines running in the non-preferred site see their I/O fail and they fail; these virtual machines can be registered and restarted on the preferred site.

Scenario: VPLEX storage volume is unavailable (for example, it is accidentally removed from the storage view or the ESXi initiators are accidentally removed from the storage view)
VPLEX behavior: VPLEX continues to serve I/O on the other site where the volume is available.
Impact/observed VMware HA behavior: If I/O is running on the lost device, ESXi detects a PDL (Permanent Device Loss) condition. The virtual machine is killed by VM Monitor and restarted by HA on the other site.

Scenario: VPLEX intersite WAN link failure and simultaneous VPLEX Witness to site B link failure
VPLEX behavior: VPLEX fails I/O on the distributed virtual volumes at site B and continues to serve I/O on site A.
Impact/observed VMware HA behavior: The virtual machines at site B fail. They can be restarted at site A. There is no impact on the virtual machines running at site A.


Scenario: VPLEX inter-site WAN link failure and simultaneous failure of the VPLEX Witness link to site A
VPLEX behavior: VPLEX fails I/O on the distributed virtual volumes at site A and continues to serve I/O at site B.
Impact/observed VMware HA behavior: The virtual machines at site A fail. They can be restarted at site B. There is no impact on the virtual machines running at site B.

Scenario: VPLEX Witness failure
VPLEX behavior: VPLEX continues to serve I/O at both sites.
Impact/observed VMware HA behavior: None.

Scenario: VPLEX Management Server failure
VPLEX behavior: None.
Impact/observed VMware HA behavior: None.

Scenario: vCenter Server failure
VPLEX behavior: None.
Impact/observed VMware HA behavior: No impact on the running virtual machines or HA. However, the DRS rules and virtual machine placements are not in effect.
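The path and device state behind the APD and PDL behaviors above can be inspected programmatically. The following is a minimal, illustrative sketch, not part of the original validation, that uses the open-source pyVmomi vSphere API bindings to report the operational state of every SCSI LUN seen by each ESXi host. The vCenter hostname and credentials are placeholders, and a pyVmomi release that supports the sslContext argument is assumed.

```python
# Illustrative sketch: list the operational state of each SCSI LUN on every
# ESXi host, which helps spot APD/PDL symptoms on VPLEX-backed devices.
# Hostname and credentials below are placeholders for your environment.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ssl_ctx = ssl._create_unverified_context()  # lab use only; validate certificates in production
si = SmartConnect(host="vcenter.example.local", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ssl_ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(content.rootFolder,
                                                   [vim.HostSystem], True)
    for host in view.view:
        storage = host.configManager.storageSystem.storageDeviceInfo
        for lun in storage.scsiLun:
            # operationalState is a list such as ['ok'] or ['error', 'off'];
            # anything other than 'ok' warrants a closer look at the VPLEX paths.
            print(host.name, lun.canonicalName, list(lun.operationalState))
    view.DestroyView()
finally:
    Disconnect(si)
```

The same information is also available on an individual host with the esxcli storage core device list command.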

Cross-Cluster Connect with Witness failure scenarios

Each scenario below lists the VPLEX behavior and the impact/observed VMware HA behavior for a Cross-Cluster Connect configuration.

Scenario: Single VPLEX back-end (BE) path failure
VPLEX behavior: VPLEX continues to operate using an alternate path to the same BE array. Distributed virtual volumes exposed to the ESXi hosts are not affected.
Impact/observed VMware HA behavior: None.

Scenario: Single front-end (FE) path failure
VPLEX behavior: The ESXi server is expected to use alternate paths to the distributed virtual volumes.
Impact/observed VMware HA behavior: None.

Scenario: BE array failure at site A
VPLEX behavior: VPLEX continues to operate using the array at site B. When the array is recovered from the failure, the storage volume at site A is resynchronized from site B automatically.
Impact/observed VMware HA behavior: None.

Scenario: VPLEX director failure
VPLEX behavior: VPLEX continues to provide access to the distributed virtual volume through other directors on the same VPLEX cluster.
Impact/observed VMware HA behavior: None.

Scenario: Complete site A failure (includes all ESXi hosts and the VPLEX cluster at site A)
VPLEX behavior: VPLEX continues to serve I/O on the surviving site (site B). When the VPLEX at the failed site (site A) is restored, the distributed virtual volumes are synchronized automatically from the active site (site B).
Impact/observed VMware HA behavior: Virtual machines running at the failed site fail. HA automatically restarts them on the surviving site.


Scenario: Complete site B failure (includes all ESXi hosts and the VPLEX cluster at site B)
VPLEX behavior: VPLEX continues to serve I/O on the surviving site (site A). When the VPLEX at site B is restored, the distributed virtual volumes are synchronized automatically from the active site (site A).
Impact/observed VMware HA behavior: Virtual machines running at the failed site fail. HA automatically restarts them on the surviving site.

Scenario: Multiple ESXi host failure(s)—Power off
VPLEX behavior: None.
Impact/observed VMware HA behavior: HA restarts the virtual machines on any of the surviving ESXi hosts within the HA cluster.

Scenario: Multiple ESXi host failure(s)—Network disconnect
VPLEX behavior: None.
Impact/observed VMware HA behavior: HA continues to exchange cluster heartbeats through the shared datastore. No virtual machine failovers occur.

Scenario: ESXi host experiences APD (All Paths Down), encountered when the ESXi host loses access to its storage volumes (in this case, VPLEX volumes)
VPLEX behavior: None.
Impact/observed VMware HA behavior: In an APD scenario, the ESXi host must be restarted to recover. If the ESXi host is restarted, HA restarts the failed virtual machines on other surviving ESXi hosts within the HA cluster.

Scenario: VPLEX ISL failure; vSphere cluster management network intact; Cross-Cluster Connect SAN ISL intact
VPLEX behavior: VPLEX transitions distributed virtual volumes on the non-preferred site to the I/O failure state. On the preferred site, the distributed virtual volumes continue to provide access.
Impact/observed VMware HA behavior: No impact on the virtual machines, because the datastore is available to the ESXi hosts through the preferred site.

Scenario: VPLEX ISL failure; vSphere cluster management network intact; Cross-Cluster Connect ISL also failed
VPLEX behavior: VPLEX transitions distributed virtual volumes on the non-preferred site to the I/O failure state. On the preferred site, the distributed virtual volumes continue to provide access.
Impact/observed VMware HA behavior: The virtual machines running on the non-preferred site of the distributed virtual volume fail. These virtual machines can be restarted manually on an ESXi host at the other (preferred) site.

Scenario: VPLEX cluster failure (the VPLEX at either site A or site B has failed, but ESXi and other LAN/WAN/SAN components are intact)
VPLEX behavior: I/O continues to be served on all the volumes on the surviving site.
Impact/observed VMware HA behavior: None of the virtual machines are affected. All ESXi hosts maintain a connection to the surviving VPLEX cluster and continue to have access to all datastores.

Scenario: Complete dual site failure
VPLEX behavior: Upon restoration of the two sites, VPLEX continues to serve I/O. Best practice is to bring up the BE storage arrays first, followed by VPLEX.
Impact/observed VMware HA behavior: The ESXi hosts should be brought up only after VPLEX is fully recovered and the distributed virtual volumes are synchronized. When the ESXi hosts at each site are powered on, the virtual machines are restarted and resume normal operations.


Scenario: Director failure at one site (the preferred site for a given distributed virtual volume) and BE array failure at the other site (the secondary site for that distributed virtual volume)
VPLEX behavior: The surviving VPLEX directors within the VPLEX cluster with the failed director continue to provide access to the distributed virtual volumes. VPLEX continues to provide access to the distributed virtual volumes using the preferred site's BE array.
Impact/observed VMware HA behavior: None.

Scenario: VPLEX ISL intact; vSphere cluster management network failure
VPLEX behavior: None.
Impact/observed VMware HA behavior: Virtual machines on each site continue running on their respective hosts, because the HA cluster heartbeats are exchanged through the shared datastore.

Scenario: VPLEX ISL failure; vSphere cluster management network failure
VPLEX behavior: VPLEX fails I/O on the non-preferred site for a given distributed virtual volume. Access to the distributed virtual volume continues on its preferred site.
Impact/observed VMware HA behavior: There are two possible scenarios:
1. If the ESXi hosts have not lost their cross-site storage connection, none of the virtual machines are affected, because all ESXi hosts have a path to the preferred site, which remains active following this failure.
2. If, instead, both the VPLEX ISL and the ESXi-to-remote-VPLEX links have gone down, virtual machines running in the non-preferred site see their I/Os fail, and the virtual machines fail. These virtual machines can be registered and restarted on the preferred site.

Scenario: VPLEX storage volume is unavailable (for example, it is accidentally removed from the storage view, or the ESXi initiators are accidentally removed from the storage view)
VPLEX behavior: VPLEX continues to serve I/O on the other site, where the volume is available.
Impact/observed VMware HA behavior: None. Each ESXi host maintains access to the VPLEX virtual volume through the alternate VPLEX cluster.


Scenario: VPLEX inter-site WAN link failure and simultaneous failure of the VPLEX Witness link to site B
VPLEX behavior: VPLEX fails I/O on the distributed virtual volumes at site B and continues to serve I/O at site A.
Impact/observed VMware HA behavior: There are two possible scenarios:
1. If the ESXi hosts have not lost their cross-site storage connection, none of the virtual machines are affected, because all ESXi hosts have a path to the preferred site, which remains active following this failure.
2. If, instead, the VPLEX ISL and the ESXi-to-remote-VPLEX links go down, virtual machines running at site B go down. These virtual machines can be restarted manually on an ESXi host at site A. Virtual machines running at site A are not affected.

Scenario: VPLEX inter-site WAN link failure and simultaneous failure of the VPLEX Witness link to site A
VPLEX behavior: VPLEX fails I/O on the distributed virtual volumes at site A and continues to serve I/O at site B.
Impact/observed VMware HA behavior: There are two possible scenarios:
1. If the ESXi hosts have not lost their cross-site storage connection, none of the virtual machines are affected, because all ESXi hosts have a path to the preferred site, which remains active following this failure.
2. If, instead, both the VPLEX ISL and the ESXi-to-remote-VPLEX links go down, virtual machines running at site A go down. These virtual machines can be restarted manually on an ESXi host at site B. Virtual machines running at site B are not affected.

Scenario: VPLEX Witness failure
VPLEX behavior: VPLEX continues to serve I/O at both sites.
Impact/observed VMware HA behavior: None.

Scenario: VPLEX Management Server failure
VPLEX behavior: None.
Impact/observed VMware HA behavior: None.

Scenario: vCenter Server failure
VPLEX behavior: None.
Impact/observed VMware HA behavior: No impact on the running virtual machines or HA. However, the DRS rules and virtual machine placements are not in effect. (A DRS site-affinity sketch follows this table.)
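Because DRS rules and placements are enforced by vCenter Server, and because the scenarios above recommend running virtual machines at the preferred site of their distributed virtual volumes, DRS "should run on" VM-to-host rules are a natural way to express that placement. The following is a minimal, illustrative pyVmomi sketch of such a rule, not taken from the original paper; the cluster, host, virtual machine, and group names are hypothetical placeholders, and it assumes an existing connection si such as the one shown earlier.

```python
# Illustrative sketch: create DRS groups and a "should run on" VM-to-host rule
# so that site A virtual machines prefer site A hosts. All names are placeholders;
# `si` is an existing pyVmomi connection (see the earlier sketch).
from pyVmomi import vim

def find_obj(content, vimtype, name):
    """Look up a managed object by name using a container view."""
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    try:
        return next(o for o in view.view if o.name == name)
    finally:
        view.DestroyView()

content = si.RetrieveContent()
cluster = find_obj(content, vim.ClusterComputeResource, "Metro-HA-Cluster")
site_a_hosts = [find_obj(content, vim.HostSystem, n)
                for n in ("esxi-a1.example.local", "esxi-a2.example.local")]
site_a_vms = [find_obj(content, vim.VirtualMachine, n) for n in ("app01", "app02")]

spec = vim.cluster.ConfigSpecEx(
    groupSpec=[
        vim.cluster.GroupSpec(operation="add",
                              info=vim.cluster.HostGroup(name="SiteA-Hosts", host=site_a_hosts)),
        vim.cluster.GroupSpec(operation="add",
                              info=vim.cluster.VmGroup(name="SiteA-VMs", vm=site_a_vms)),
    ],
    rulesSpec=[
        vim.cluster.RuleSpec(operation="add",
                             info=vim.cluster.VmHostRuleInfo(name="SiteA-VMs-on-SiteA-Hosts",
                                                             enabled=True,
                                                             mandatory=False,  # "should", not "must"
                                                             vmGroupName="SiteA-VMs",
                                                             affineHostGroupName="SiteA-Hosts")),
    ],
)
task = cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
```

Using a non-mandatory ("should run on") rule keeps the preferred-site placement during normal operation while still allowing HA to restart virtual machines on the other site after a failure.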


Conclusion

High availability and data mobility are key requirements for efficient, cost-effective IT operations in virtualized data centers. VPLEX Local or Metro with VMware vMotion provides a comprehensive solution that fulfills these requirements and ensures business continuity and workload mobility for Vblock systems. VPLEX Metro enables transparent load sharing among multiple sites and the flexibility to relocate workloads between sites in anticipation of planned events, such as data center relocations and site maintenance.

In the event that a site fails unexpectedly, failed services can be restarted at the surviving site with minimal effort and time to recovery. In a VPLEX with VPLEX Witness configuration, applications continue to operate in the surviving site with no interruption or downtime.

The combination of Vblock systems, VPLEX, and vMotion provides new ways to solve IT problems, allowing administrators to:

- Move applications and their data between data centers with no disruption (see the sketch after this list).
- Provide continuous operations during and after site disasters.
- Balance workloads across Vblock systems.
- Collaborate over distance with shared data.
- Aggregate data centers and provide 24/7 availability.
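As a concrete illustration of nondisruptive workload relocation between sites, the following pyVmomi sketch, which is not part of the original paper, live-migrates a virtual machine to an ESXi host at the other site. Because the VPLEX distributed virtual volume presents the same datastore at both sites, only the host changes; the virtual machine and host names are placeholders, and si and find_obj() are reused from the earlier sketches.

```python
# Illustrative sketch: vMotion a VM to an ESXi host at the other site. With a
# VPLEX Metro distributed virtual volume the datastore is shared, so only the
# host needs to change. `si` and find_obj() come from the earlier sketches.
from pyVmomi import vim

content = si.RetrieveContent()
vm = find_obj(content, vim.VirtualMachine, "app01")                  # placeholder VM name
target = find_obj(content, vim.HostSystem, "esxi-b1.example.local")  # placeholder site B host

task = vm.MigrateVM_Task(host=target,
                         priority=vim.VirtualMachine.MovePriority.defaultPriority)
# Poll task.info.state (or use a task-wait helper) to confirm the migration completes.
```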

Next steps

To learn more about this and other solutions, contact a VCE representative or go to www.vce.com.


Additional references

Refer to the following documents and web resources for additional information on the topics in this white paper.

VCE

Vblock Infrastructure Platforms Architecture Overview

Vblock Solution for SAP Mobility

Workload Mobility with VMware vMotion and EMC VPLEX on Vblock Platforms

Enhanced Business Continuity with Application Mobility

VMware

VMware vCenter Server Heartbeat Administrator Guide

VMware vCenter Server web page

EMC

Some EMC technical documentation is available only on the EMC Powerlink website at http://Powerlink.EMC.com. Registration is required.

EMC VPLEX 5.0 Architecture Guide

EMC VPLEX with GeoSynchrony 5.0 and Point Releases Product Guide (P/N 300-012-307)

EMC VPLEX with GeoSynchrony 5.0 and Point Release CLI Guide (P/N 300-012-31)

EMC VPLEX Site Preparation Guide (P/N 300-010-495)

EMC VPLEX Metro Witness Technology and High Availability TechBook

Oracle Extended RAC With EMC VPLEX Metro Best Practices Planning


ABOUT VCE

VCE, formed by Cisco and EMC with investments from VMware and Intel, accelerates the adoption of converged infrastructure and cloud-based computing models that dramatically reduce the cost of IT while improving time to market for our customers. VCE, through the Vblock system, delivers the industry's first completely integrated IT offering with end-to-end vendor accountability. VCE solutions are available through an extensive partner network, and cover horizontal applications, vertical industry offerings, and application development environments, allowing customers to focus on business innovation instead of integrating, validating, and managing IT infrastructure. For more information, go to www.vce.com.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." VCE MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright © 2012 VCE Company, LLC. All rights reserved. Vblock and the VCE logo are registered trademarks or trademarks of VCE Company, LLC and/or its affiliates in the United States or other countries. All other trademarks used herein are the property of their respective owners.