Optimizing Business Continuity Management with Micro Focus PlateSpin Protect and AppManager

White Paper: PlateSpin Protect, AppManager

Table of Contents

Introduction
Why Monitor PlateSpin Protect Servers?
What to Monitor
Where to Get the Knowledge Scripts
How to Install the Knowledge Scripts
Using the Scripts without AppManager
Where to Get Help
Appendix A: Knowledge Script Detail
Appendix B: Event Log Types
Appendix C: Typical AppManager Server Specifications

www.microfocus.com

Introduction

This document describes how organizations can use NetIQ® AppManager® to proactively monitor the availability, capacity, health, and quality of a business continuity solution built with Micro Focus® PlateSpin® Protect. Although this document refers to PlateSpin Protect throughout, it is equally relevant for PlateSpin Forge®, the hardware appliance version of PlateSpin Protect.

PlateSpin Protect is a software-based disaster recovery product. It uses a VMware-based virtual infrastructure as the target platform for replication of one or more source workloads that need protection against disasters or power outages. These source servers may be physical or virtual Windows and Linux systems. Once protection has been configured, the systems are replicated into warm stand-by virtual machines (VMs) in the VMware virtual infrastructure. In case of a disaster, the warm stand-by VMs are booted to ensure business service continuity. Once the source servers have been rebuilt, the contents of the warm stand-by VMs can be replicated back to these original servers.

AppManager is an IT operations management tool designed for IT operations teams who need rapid time to value and the flexibility to support the diverse needs of multiple business units. It provides broader and deeper technology support than competing products, enables greater IT process automation to reduce monitoring gaps, and delivers more relevant events so you can troubleshoot IT problems faster. For more information on these solutions, visit www.microfocus.com.

Why Monitor PlateSpin Protect Servers?

Organizations that deploy PlateSpin Protect recognize its importance in providing business continuity for essential business services. Because a disaster recovery service could be called upon at any time, it makes sense to verify its availability, capacity, health, and quality in order to provide reassurance that it is capable of meeting business expectations at all times. The importance of monitoring the disaster recovery service becomes ever greater as the service grows and protects an increasing number of essential workloads.


Furthermore, although PlateSpin Protect has some built-in capabilities to notify individuals of certain issues that might impact the disaster recovery service, these capabilities were not designed to cover all aspects of its operation, nor are they an alternative to the monitoring capabilities of a dedicated systems management solution.

What to Monitor

A number of monitoring scripts, known as Knowledge Scripts (KS), have been developed for AppManager to provide comprehensive insight into the health, availability, capacity, and quality of one or more PlateSpin Protect servers. Each KS focuses on a particular aspect of the PlateSpin Protect service.

There is also a Discovery Knowledge Script that retrieves details of each PlateSpin Protect implementation and represents them within the AppManager console. For additional details on each script, see Appendix A.

In addition to these application-specific Knowledge Scripts, Micro Focus recommends deploying general monitoring for the Windows operating system of each Protect server. Examples of components you should monitor include processor, memory, disk, and network utilization, the up/down status of the OS and Windows services, and physical disk queue length. Doing so will highlight any underlying OS issues that could impact the PlateSpin Protect application.


| Service Aspect | Knowledge Script Name | Purpose |
| --- | --- | --- |
| Availability and health | Service Down | Monitors the status (up or down) of Windows services related to PlateSpin Protect, such as the Management Service and Microsoft SQL Server services. An Event is raised if any service is not running or cannot be restarted. |
| Availability and health | Web Interface Status | Monitors the availability of the Protect Web Interface. An Event is raised if the website does not load successfully or times out. |
| Availability and health | Workload Status | Monitors the Windows Application Event Log for entries relating to the status of protected workloads, such as missed or failed replications. Refer to Appendix B or the Protect documentation for further details. Note that this KS requires PlateSpin Protect version 11.1 or later. |
| Availability and health | Workload Protection Level | Monitors the number of workloads by Protection Level. |
| Capacity | Container Space | Monitors the utilization and available space of Protect Containers. |
| Capacity | License Usage | Monitors the utilization and availability of PlateSpin Protect licenses. |
| Quality | Last Test Failover | Monitors the age of the last test failover performed on each protected workload. |


Where to Get the Knowledge Scripts

The Knowledge Scripts are available in the AppManager Community Package. This is a set of scripts and utilities developed by staff, customers, and partners that is shared as is with other AppManager users. The Community Package is updated periodically and is available to download on the AppManager module updates page on the Micro Focus website.

Please note that the AppManager Community Package is provided on an as-is, as-available basis without any warranty of any kind.

How to Install the Knowledge Scripts

If you have already deployed AppManager in your environment, start by installing the AppManager for Windows agent on each PlateSpin Protect Server. If you have not already deployed AppManager, you need a dedicated Windows server; the suggested hardware and software specifications are described in Appendix C. AppManager may be installed and evaluated for 30 days, after which it will require licensing to continue operating. To license AppManager appropriately for Protect, one AppManager for Windows license is required for each Protect Server to be monitored, plus an additional license for the AppManager server itself. The software can be downloaded from www.microfocus.com.

The Knowledge Scripts are installed into AppManager as follows:

1. Copy and extract the Protect zip file onto a machine where an AppManager console is installed.

2. Open a command prompt in the location of the extracted files and enter the following command:

prompt> typeutil checkin <sql instance>:<repository name>:: ProtectObjType.xml

For example, if the AppManager repository database is installed in a default SQL Server instance on LONNQAM01 and is named QDB, then the command would be:

prompt> typeutil checkin LONNQAM01:QDB:: ProtectObjType.xml

3. Open the AppManager Control Center Console and navigate to the Knowledge Scripts view of the Master Management Group. Then right-click and select the menu option Check In Knowledge Script. Navigate to the folder where you extracted the files and select all QML files.


4. From the Discovery category of the Knowledge Scripts view of the Master Management Group, deploy the Knowledge Script Discovery_Protect to run once on each PlateSpin Protect server on which an AppManager agent has already been installed (and the Windows module has been discovered). The details of each Protect server should now be visible in the console, and the Knowledge Scripts described previously can now be deployed to monitor the various aspects of the application.

It is recommended that you deploy these Knowledge Scripts using Monitoring Policy by creating a Management Group containing PlateSpin Protect resources and an associated monitoring standard defined in a Knowledge Script Group.

Should you require any assistance with these activities, refer to the AppManager user guides, which are included with the source files.
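If the check-in from step 2 has to be repeated for several repositories, the command line can be assembled programmatically. The following Python sketch only builds the argument list in the `<sql instance>:<repository name>::` form shown above; the function name and its default file argument are illustrative and not part of AppManager.

```python
from __future__ import annotations


def build_checkin_command(sql_instance: str, repository: str,
                          type_file: str = "ProtectObjType.xml") -> list[str]:
    """Return the typeutil argument list for one repository.

    The target keeps the '<sql instance>:<repository name>::' form from
    step 2, including the two trailing colons exactly as documented.
    """
    if not sql_instance or not repository:
        raise ValueError("SQL instance and repository name are both required")
    target = f"{sql_instance}:{repository}::"
    return ["typeutil", "checkin", target, type_file]


if __name__ == "__main__":
    # Values taken from the worked example in step 2.
    print(" ".join(build_checkin_command("LONNQAM01", "QDB")))
```

The resulting list can be passed to `subprocess.run` from the folder that holds the extracted files, on a machine where typeutil is available.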

Using the Scripts without AppManager

This monitoring solution was designed for AppManager, and it is recommended that you install AppManager to implement it. However, if you prefer to use another systems management tool, it might be possible to port the Knowledge Scripts.

All of the PlateSpin Protect Knowledge Scripts are written in VBScript; as such, it might be possible to port them to another systems management solution that supports the same scripting language and takes a similar monitoring approach. However, this approach requires some recoding to substitute AppManager-specific activities such as creating Events, collecting Data, and debug logging.

All of the Knowledge Scripts also use information provided by discovery for tasks such as presenting Events and Data in the appropriate part of the resource tree; these references would also need to be replaced. The scripts are written in a way that makes them easy to convert to VBS format.

Two of the Knowledge Scripts, namely Workload Status and Service Down, use AppManager-specific components; as such, they cannot be directly ported to another platform. However, scanning Event Logs and checking Windows Services are capabilities available in many systems management tools and should be relatively simple to substitute.
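As an illustration of how the Windows service check could be substituted outside AppManager, the sketch below reads the state reported by the standard Windows `sc query` command. It is not derived from the Knowledge Script source; the function names are hypothetical, and the parser assumes the English-language `STATE : 4  RUNNING` output layout.

```python
from __future__ import annotations

import re
import subprocess

# Matches the state line of 'sc query', for example "STATE : 4  RUNNING".
_STATE_RE = re.compile(r"STATE\s*:\s*\d+\s+([A-Z_]+)")


def parse_sc_state(sc_output: str) -> str | None:
    """Extract the state keyword (RUNNING, STOPPED, ...) from sc output."""
    match = _STATE_RE.search(sc_output)
    return match.group(1) if match else None


def service_is_running(service_name: str) -> bool:
    """Query one service with 'sc query' (Windows only) and test its state."""
    result = subprocess.run(["sc", "query", service_name],
                            capture_output=True, text=True, check=False)
    return parse_sc_state(result.stdout) == "RUNNING"
```

Keeping the parsing separate from the subprocess call means the parser can be exercised with captured output on any platform before the check is scheduled on a Protect server.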

You are welcome to develop your own monitoring solution for PlateSpin Protect in any systems management tool, using concepts taken from this white paper.


Where to Get Help

More detailed information on this solution is available in the appendices. If you need general assistance with AppManager, you can use regular channels such as the customer forum, your account manager, or Technical Support. If you have questions that relate specifically to these Knowledge Scripts, you can contact their author, Alain Salesse, at alain.salesse@microfocus.com.

Appendix A: Knowledge Script Detail

This section describes the parameters associated with each Knowledge Script and provides a sample screenshot of its output.

Discovery_PlateSpinProtect

Purpose: Discovers the resources associated with a PlateSpin Protect implementation and represents them within the AppManager console. This serves to control which Knowledge Scripts can be deployed and acts as a placeholder for representing alarms and metric data for charting and reporting.

Schedule: The default schedule is to run once.

Parameters:

| Parameter | Description |
| --- | --- |
| Raise Event for successful discovery | Check this box to generate an Event when PlateSpin Protect Discovery is successful. |
| Event Severity - discovery okay | Enter a value between 1 and 40 for the severity of the Event. |
| Raise Event for unexpected error | Check this box to generate an Event when PlateSpin Protect Discovery fails or is partially successful. |
| Event Severity - unexpected error | Enter a value between 1 and 40 for the severity of the Event. |
| SQL User | The KS attempts to connect to the SQL Server instance that hosts the PlateSpin Protect database. If the account that runs the AppManager agent service has permission to access the SQL Instance, leave this parameter blank to use Windows authentication; otherwise, specify an SQL Login that has access. |
| SQL Password | If an SQL Login has been specified, enter its password here. Note: If PlateSpin Protect was used to install Microsoft SQL Server Express and SQL authentication is required, leave the password value as is and specify sa in the SQL User parameter. |
| View Name | Specify the name of the discovery view. |

Fig. 1: PlateSpin Protect Discovery details (composite)

Protect_ServiceDown

Purpose: Monitors the status of Windows services associated with PlateSpin Protect. Optionally attempts to restart a service if it is not running.

Schedule: The default schedule is every five minutes.

Parameters:

| Parameter | Description |
| --- | --- |
| Monitor PlateSpin Protect Management Service? | Check this box to monitor the service if it was detected by discovery. |
| Monitor PlateSpin Operations Framework Controller? | Check this box to monitor the service if it was detected by discovery. |
| Monitor SQL Server Browser? | Check this box to monitor the service if it was detected by discovery. |
| Monitor SQL Server? | Check this box to monitor the service if it was detected by discovery. |
| Auto-start service? | Check this box to attempt to restart the above services if they are not running. |
| Event severity: auto-start failed | Enter a value between 1 and 40 for the severity of the Event when a service failed to restart. |
| Event severity: auto-start succeeded | Enter a value between 1 and 40 for the severity of the Event when a service was not running but restarted successfully. |
| Event severity: service is down and auto-start is not enabled | Enter a value between 1 and 40 for the severity of the Event when a service is not running and a restart was not attempted. |
| Collect data for service status | Check this box to collect data for the up/down status of each service. A value of 100 is returned when the service is up, and 0 when it is down. |
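The three Event outcomes and the 100/0 data value described above can be summarized as a small decision function. This Python sketch is a reading of the parameter descriptions, not the Knowledge Script itself; the names are hypothetical, and reporting 100 after a successful restart is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ServiceCheck:
    event: str | None  # which Event applies, or None when the service is up
    data_value: int    # 100 when the service is up, 0 when it is down


def evaluate_service(running: bool, auto_start: bool,
                     restart_ok: bool = False) -> ServiceCheck:
    """Map one service observation to an Event name and a data value."""
    if running:
        return ServiceCheck(None, 100)
    if not auto_start:
        return ServiceCheck("service is down and auto-start is not enabled", 0)
    if restart_ok:
        # Assumption: a service that restarted successfully is reported as up.
        return ServiceCheck("auto-start succeeded", 100)
    return ServiceCheck("auto-start failed", 0)
```

Each Event name would then be paired with the severity (1 to 40) configured for it.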

Fig. 2: AppManager Events summary view

Protect_WebInterfaceStatus

Purpose: Monitors the availability of the PlateSpin Protect Web Interface. An Event is raised if the website does not load successfully or times out.

Schedule: The default schedule is every 15 minutes.

Parameters:

| Parameter | Description |
| --- | --- |
| Web Interface URL | The address for the PlateSpin Protect Web Interface. The default is http://localhost/Protect |
| Page Timeout in Seconds | Specify the maximum load time in seconds. |
| Event Severity - website unavailable | Enter a value between 1 and 40 for the severity of the Event when the Web Interface does not load or times out. |
| Include HTML in Event | Check this box to include the raw HTML in the Event. |
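A comparable availability probe can be written with the Python standard library. The sketch below is an assumption-laden stand-in for the Knowledge Script: the 30-second timeout is an arbitrary illustrative value (the paper does not state a default), and the `opener` argument is a hypothetical hook added only so the function can be tested without a live server.

```python
from __future__ import annotations

import urllib.request
from typing import Callable


def check_web_interface(
    url: str = "http://localhost/Protect",
    timeout_seconds: float = 30.0,  # assumed value; the paper gives no default
    opener: Callable = urllib.request.urlopen,
) -> tuple[bool, str]:
    """Return (available, detail) for one load attempt of the Web Interface."""
    try:
        with opener(url, timeout=timeout_seconds) as response:
            return True, f"HTTP {response.status}"
    except OSError as exc:
        # URLError, HTTPError and socket timeouts all derive from OSError.
        return False, f"{type(exc).__name__}: {exc}"
```

A result of `False` corresponds to the condition under which the Knowledge Script raises its "website unavailable" Event.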

Fig. 3: Web Interface Status Event

Protect_WorkloadStatus

Purpose: Monitors the Windows Application Event Log for entries relating to the status of workloads with a protection contract. Note that this KS requires PlateSpin Protect version 11.1 or later, and as such it will not execute if the discovered version is older.

Schedule: The default schedule is every hour.

Parameters:

| Parameter | Description |
| --- | --- |
| Monitor Error Events? | Check this box to monitor events classified as Error. Refer to Appendix B for details. |
| Event Severity for Error Events | Enter a value between 1 and 40 for the severity of the Event when Error entries are found. |
| Monitor Warning Events? | Check this box to monitor events classified as Warning. Refer to Appendix B for details. |
| Event Severity for Warning Events | Enter a value between 1 and 40 for the severity of the Event when Warning entries are found. |
| Monitor Info Events? | Check this box to monitor events classified as Info. Refer to Appendix B for details. |
| Event Severity for Info Events | Enter a value between 1 and 40 for the severity of the Event when Info entries are found. |
| Event Severity for unexpected error | Enter a value between 1 and 40 for the severity of the Event when an unexpected error is encountered. |
| Maximum Entries to include in the Event | Define the maximum number of entries to be included in the Event. If the number of records found exceeds this, then the most recent ones will be shown. |
| Hours to scan | By default, the KS scans from the point it reached on its last execution. If a number greater than 0 is defined, the KS will scan that many hours back in the Event Log. If a value of -1 is defined, then the entire log will be scanned. Only whole numbers are accepted. This should be used for testing purposes only. |
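The Hours to scan and Maximum Entries rules can be sketched as two small functions. This is an interpretation of the parameter descriptions rather than the script's actual code: treating 0 as the default "resume from last execution" value is inferred, the function names are hypothetical, and the first-ever execution (no previous run) is not modeled.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def scan_start(hours_to_scan: int, last_run: datetime,
               now: datetime) -> datetime | None:
    """Return the oldest timestamp to scan, or None to scan the entire log."""
    if isinstance(hours_to_scan, bool) or not isinstance(hours_to_scan, int):
        raise TypeError("Hours to scan accepts whole numbers only")
    if hours_to_scan == -1:
        return None                       # -1: scan the whole Event Log
    if hours_to_scan == 0:
        return last_run                   # assumed default: resume from last run
    if hours_to_scan > 0:
        return now - timedelta(hours=hours_to_scan)
    raise ValueError("Hours to scan must be -1, 0, or a positive whole number")


def most_recent(entries: list[tuple[datetime, str]],
                maximum: int) -> list[tuple[datetime, str]]:
    """Keep only the newest 'maximum' entries, returned oldest first."""
    if maximum <= 0:
        return []
    return sorted(entries, key=lambda entry: entry[0])[-maximum:]
```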

Fig. 4: Workload Status Event

Note that the Protection Levels used by the Workload Protection Level Knowledge Script map to the workload status on the PlateSpin Protect home page as follows: Protected = Green, Failed = Red, everything else = Amber.

Protect_WorkloadProtectionLevel

Purpose: Monitors the number of workloads by Protection Level and raises an Event if the number exceeds the threshold. The Protection Level is based on the workload status overview on the PlateSpin Protect Web Interface home page.

Schedule: The default schedule is every 24 hours.

Parameters:

| Parameter | Description |
| --- | --- |
| Monitor Protection Level Failed? | Check this box to monitor the number of workloads whose Protection Level is Failed. |
| Threshold for Protection Level Failed | Enter the threshold for the number of workloads at this Protection Level. An Event is raised if the number exceeds it. |
| Event Severity for Protection Level Failed | Enter a value between 1 and 40 for the severity of the Event when the threshold is exceeded. |
| Collect data for Protection Level Failed | Check this box to collect data for the number of workloads at this Protection Level. |
| Event Severity for unexpected error | Enter a value between 1 and 40 for the severity of the Event when an unexpected error is encountered. |

There are equivalent parameters to the four Failed parameters above for each of the following protection levels: Under-protected, Pending failed, Unknown, Expired, Pending, Protected.
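The per-level counting and threshold comparison can be sketched as follows, together with the home-page color mapping (Protected = Green, Failed = Red, everything else = Amber). The function names and the plain list-of-strings input are assumptions for illustration; how the Knowledge Script actually obtains the levels is not described in this paper.

```python
from __future__ import annotations

from collections import Counter


def status_colour(level: str) -> str:
    """Map a Protection Level to the colour shown on the home page."""
    if level == "Protected":
        return "Green"
    if level == "Failed":
        return "Red"
    return "Amber"


def levels_over_threshold(workload_levels: list[str],
                          thresholds: dict[str, int]) -> dict[str, int]:
    """Return the count for each monitored level that exceeds its threshold."""
    counts = Counter(workload_levels)
    return {level: counts[level]
            for level, limit in thresholds.items()
            if counts[level] > limit}
```

For example, with a threshold of 0 for Failed, a single failed workload is enough to be reported.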

Fig. 5: Workload Protection Event

Protect_ContainerSpace

Purpose: Monitors the percentage utilization and free space on each PlateSpin Protect Container.

Schedule: The default schedule is every hour.

Parameters:

| Parameter | Description |
| --- | --- |
| Monitor Container Utilization? | Check this box to monitor the percentage utilization of each container. |
| Maximum Utilization | Enter a value between 1 and 100 for the maximum utilization of a container. |
| Severity for Utilization above Threshold | Enter a value between 1 and 40 for the severity of the Event when container utilization exceeds the threshold. |
| Monitor Container Free Space? | Check this box to monitor the free space available in each container. |
| Minimum Free Space | Enter the minimum free space in GB. |
| Severity for Free Space below Threshold | Enter a value between 1 and 40 for the severity of the Event when free space is below the threshold. |
| Collect data for container utilization? | Check this box to collect data for the percentage utilization of each container. |
| Collect data for container space used? | Check this box to collect data for the space used in GB (that is, total minus free) of each container. |
| Event Severity for unexpected error | Enter a value between 1 and 40 for the severity of the Event when an unexpected error is encountered. |
| SQL User | If the account that runs the AppManager agent service has permission to access the SQL Instance, leave this parameter blank to use Windows authentication; otherwise, specify an SQL Login that has access. |
| SQL Password | If an SQL Login has been specified, enter its password here. Note: If PlateSpin Protect was used to install Microsoft SQL Server Express and SQL authentication is required, leave the password value as is and specify sa in the SQL User parameter. |
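The arithmetic behind the two container checks is simple enough to show directly. In this sketch, space used is total minus free (as stated above) and utilization is used space as a percentage of total; the function names and the Event labels are illustrative, not taken from the Knowledge Script.

```python
from __future__ import annotations


def container_metrics(total_gb: float, free_gb: float) -> tuple[float, float]:
    """Return (used_gb, utilization_pct) for one container."""
    if total_gb <= 0 or not 0 <= free_gb <= total_gb:
        raise ValueError("expected 0 <= free_gb <= total_gb and total_gb > 0")
    used_gb = total_gb - free_gb
    return used_gb, 100.0 * used_gb / total_gb


def container_events(total_gb: float, free_gb: float,
                     max_utilization_pct: float,
                     min_free_gb: float) -> list[str]:
    """List the threshold conditions that one container currently breaches."""
    _, utilization = container_metrics(total_gb, free_gb)
    events = []
    if utilization > max_utilization_pct:
        events.append("utilization above threshold")
    if free_gb < min_free_gb:
        events.append("free space below threshold")
    return events
```

A 500 GB container with 50 GB free is 90 percent utilized, so it breaches both a Maximum Utilization of 85 and a Minimum Free Space of 100 GB.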

Fig. 6: Container Space Event

Protect_LicenseUsage

Purpose: Monitors the percentage utilization and the number of available PlateSpin Protect licenses.

Schedule: The default schedule is every 24 hours.

Parameters:

| Parameter | Description |
| --- | --- |
| Monitor the number of licenses available? | Check this box to monitor the number of workload licenses available on the PlateSpin Protect server. |
| Minimum number of licenses available | Enter the threshold for the minimum number of licenses available. |
| Severity for number of licenses low | Enter a value between 1 and 40 for the severity of the Event when the number of available licenses is below the threshold. |
| Monitor license utilization? | Check this box to monitor the percentage of workload licenses used. |
| Maximum license utilization | Enter the threshold for the maximum license utilization. |
| Severity for license utilization high | Enter a value between 1 and 40 for the severity of the Event when the percentage of licenses used is too high. |
| Collect data for licenses available? | Check this box to collect data for the number of licenses available. |
| Collect data for license utilization? | Check this box to collect data for the percentage of licenses used. |
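The license checks combine a count (licenses available) with a percentage (licenses used). The sketch below assumes that "too high" means strictly above the Maximum license utilization and that a server with no licenses reports 0 percent utilization; both are assumptions, as are the function and key names.

```python
from __future__ import annotations


def license_status(total: int, used: int, min_available: int,
                   max_utilization_pct: float) -> dict:
    """Summarize license availability and utilization against thresholds."""
    if total < 0 or not 0 <= used <= total:
        raise ValueError("expected 0 <= used <= total")
    available = total - used
    # Assumption: with no licenses installed, utilization is reported as 0.
    utilization = 100.0 * used / total if total else 0.0
    return {
        "available": available,
        "utilization_pct": utilization,
        "licenses_low": available < min_available,
        "utilization_high": utilization > max_utilization_pct,
    }
```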

Fig. 7: License Utilization Event

Protect_LastTestFailover

Purpose: Monitors the age of the last test failover of each protected workload. Raises an Event if the age exceeds the threshold.

Schedule: The default schedule is weekly at 11:00 am on Monday.

Parameters:

| Parameter | Description |
| --- | --- |
| Maximum age of last test failover | Enter a value for the maximum age of the last test failover in days or weeks. |
| Unit for age of last test failover | Select Day or Week from the list as appropriate. |
| Event Severity for age threshold exceeded | Enter a value between 1 and 40 for the severity of the Event when the last test failover is too old. |
| Event Severity for unexpected error | Enter a value between 1 and 40 for the severity of the Event when an unexpected error is encountered. |
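The age comparison with its Day or Week unit can be expressed in a few lines. This sketch assumes the timestamp of the last test failover is already known for each workload; how the Knowledge Script obtains it, and how it treats a workload that has never been tested, is not covered in this paper. The function name is hypothetical.

```python
from __future__ import annotations

from datetime import datetime, timedelta

_DAYS_PER_UNIT = {"Day": 1, "Week": 7}


def failover_too_old(last_test: datetime, now: datetime,
                     max_age: int, unit: str) -> bool:
    """Return True when the last test failover is older than the threshold."""
    if unit not in _DAYS_PER_UNIT:
        raise ValueError("unit must be 'Day' or 'Week'")
    limit = timedelta(days=max_age * _DAYS_PER_UNIT[unit])
    return now - last_test > limit
```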


Appendix B: Event Log Types

PlateSpin Protect records information in the Windows Application Event Log relating to the status of protected workloads. This information is used by the Workload Status Knowledge Script. All entries share a source named PlateSpinEvents. Note that this capability was introduced in version 11.1, and the Knowledge Script checks the discovered version before it executes. If an earlier version of PlateSpin Protect has been updated to 11.1 or later, re-run the Discovery script before deploying the Workload Status KS.

Fig. 8: Last Test Failover Event

162-000117-001 | N | 02/17 | © 2017 Micro Focus. All rights reserved. Micro Focus, the Micro Focus logo, PlateSpin, and PlateSpin Forge, among others, are trademarks or registered trademarks of Micro Focus or its subsidiaries or affiliated companies in the United Kingdom, United States and other countries. NetIQ and AppManager are trademarks or registered trademarks of NetIQ Corporation in the USA. All other marks are the property of their respective owners.

| Event type | Condition | Remarks |
| --- | --- | --- |
| Warning | Full replication missed | Similar to incremental replication missed. |
| Warning | Incremental replication missed | Generated when any of the following applies: a replication is manually paused while a scheduled incremental replication is due; the system attempts to carry out a scheduled incremental replication while a manually triggered replication is underway; the system determines that the target has insufficient free disk space. |
| Warning | Workload offline detected | Generated when the system detects that a previously online workload is now offline. Applies to workloads whose protection contract’s state is not Paused. |
| Error | Failover failed | |
| Error | Full replication failed | |
| Error | Incremental replication failed | |
| Error | Prepare failover failed | |
| Information | Failover completed | |
| Information | Full replication completed | |
| Information | Incremental replication completed | |
| Information | Prepare failover completed | |
| Information | Test failover completed | Generated upon manually marking a Test Failover operation a success or a failure. |
| Information | Workload online detected | Generated when the system detects that a previously offline workload is now online. Applies to workloads whose protection contract’s state is not Paused. |
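When porting the Workload Status check to another tool, the table above can be carried over as a lookup from condition to event type. The mapping below copies the conditions listed in this appendix; the filter function and its arguments are hypothetical, and matching on the bare condition name is an assumption, since the exact text of the Event Log entries is not shown in this paper.

```python
from __future__ import annotations

EVENT_TYPES = {
    "Full replication missed": "Warning",
    "Incremental replication missed": "Warning",
    "Workload offline detected": "Warning",
    "Failover failed": "Error",
    "Full replication failed": "Error",
    "Incremental replication failed": "Error",
    "Prepare failover failed": "Error",
    "Failover completed": "Information",
    "Full replication completed": "Information",
    "Incremental replication completed": "Information",
    "Prepare failover completed": "Information",
    "Test failover completed": "Information",
    "Workload online detected": "Information",
}


def filter_conditions(conditions: list[str], monitor_error: bool = True,
                      monitor_warning: bool = True,
                      monitor_info: bool = False) -> list[str]:
    """Keep the conditions whose event type is selected for monitoring."""
    wanted = set()
    if monitor_error:
        wanted.add("Error")
    if monitor_warning:
        wanted.add("Warning")
    if monitor_info:
        wanted.add("Information")
    # Conditions that are not in the table are dropped rather than guessed.
    return [c for c in conditions if EVENT_TYPES.get(c) in wanted]
```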

Micro Focus
UK Headquarters, United Kingdom: +44 (0) 1635 565200
U.S. Headquarters, Rockville, Maryland: 301 838 5000, 877 772 4450
Additional contact information and office locations: www.microfocus.com

Appendix C: Typical AppManager Server Specifications

The following represent typical hardware and software specifications for an AppManager version 9.1 or later deployment. The server can be physical or virtual.

Windows Server 2008 R2, 2012, or 2012 R2

1 or 2 processor cores

4 GB RAM

Microsoft SQL Server 2012, 2012R2 or 2014 Express, or a local or remote “full” edition

Microsoft .NET Framework 3.5 SP1

Microsoft XML Parser 3.0 SP1 or later

Microsoft Distributed Transaction Coordinator service

Microsoft Background Intelligent Transfer Service (BITS)

Microsoft IIS with IIS 6 backwards compatibility components

ASP.NET with v2.0.50727 Web Service Extension enabled