VMworld 2013: VMware Disaster Recovery Solution with Oracle Data Guard and Site Recovery Manager
Disaster Recovery Solution with Oracle Data Guard and Site Recovery Manager Kannan Mani, VMware Brad Pinkston, VMware BCO4905 #BCO4905

Transcript of VMworld 2013: VMware Disaster Recovery Solution with Oracle Data Guard and Site Recovery Manager

Page 1: VMworld 2013: VMware Disaster Recovery Solution with Oracle Data Guard and Site Recovery Manager

Disaster Recovery Solution with Oracle Data Guard

and Site Recovery Manager

Kannan Mani, VMware

Brad Pinkston, VMware

BCO4905

#BCO4905

Page 2

Agenda

Introduction

SRM and Oracle Data Guard

Architecture Overview

Demo

Best Practices

Summary

Q&A

Page 3

Introduction

Page 4

Kannan Mani

15+ years of Oracle experience: Oracle RAC, ASM, clustering, CRM, ERP, business intelligence, performance and scalable enterprise application architecture, benchmarking and performance, technical solutions marketing and management, virtualization and cloud solutions.

Oracle ACE – Applications, DB

Speaker at Oracle OpenWorld, IOUG, VMworld, VMware Partner Exchange, EMC World, and webinars

Industry-recognized expert in Oracle and virtualization technologies.

Blog: http://blogs.vmware.com/apps/oracle

Page 5

SRM and Oracle Data Guard

Page 6

SRM Provides Broad Choice of Replication Options

vSphere Replication

Simple, cost-efficient replication for Tier 2 applications and smaller sites

Storage-based Replication

High-performance replication for business-critical applications in larger sites

[Diagram: Site A (Primary) and Site B (Recovery) each run vCenter Server, Site Recovery Manager, and vSphere; the two sites are linked by vSphere Replication or storage-based replication.]

Page 7

vSphere Replication Complements Storage-Based Replication

Replication: vSphere Replication
Provider: VMware
Cost:
• Low-end storage supported
• No additional replication software
Management:
• VM-level granularity
• Managed directly in vCenter
Performance:
• 15 min RPOs
• Scales to 500 VMs
• File-level consistency
• No automated failback, FT, linked clones, physical RDM

Replication: Storage-based Replication
Cost:
• Higher-end replicating storage
• Additional replication software
Management:
• LUN – VM layout
• Storage team coordination
Performance:
• Synchronous replication
• High data volumes
• Application consistency possible
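
The comparison above can be turned into a rough first-pass selection rule. The sketch below is illustrative only: the function name `choose_replication` is hypothetical, and the thresholds (15-minute RPO, 500 VMs) are simply the figures quoted on this slide, not limits checked against any current product documentation.

```shell
#!/bin/sh
# choose_replication RPO_MINUTES VM_COUNT
# Hypothetical helper: suggests a replication type using only the
# two numeric limits quoted on the slide (15 min RPO, 500 VMs).
choose_replication() {
    rpo_min=$1
    vm_count=$2
    if [ "$rpo_min" -ge 15 ] && [ "$vm_count" -le 500 ]; then
        echo "vSphere Replication"
    else
        # Tighter RPOs or larger estates fall outside the quoted limits.
        echo "Storage-based replication"
    fi
}
```

For example, `choose_replication 30 120` suggests vSphere Replication, while a near-zero RPO requirement falls through to storage-based replication. Other factors listed on the slide (physical RDMs, FT, failback needs) would have to be checked separately.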

Page 8

Oracle Data Guard

http://www.oracle.com/technetwork/database/features/availability/twp-dataguard-11gr2-1-131981.pdf

Oracle Data Guard provides the management, monitoring, and automation software infrastructure to create and maintain one or more standby databases to protect Oracle data from failures, disasters, errors, and data corruptions. Data Guard is unique among Oracle replication solutions in supporting both synchronous (zero data loss) and asynchronous (near-zero data loss) configurations.

Administrators can choose either manual or automatic failover of production to a standby system if the primary system fails, in order to maintain high availability for mission-critical applications.

Page 9

Architecture Overview

Page 11

Steps Tested

1 Oracle Primary DB at Site A and Standby DB at Site B with Data Guard

2 SAP Application connected to Primary DB at Site A

3 Site A Down - SAP Application and Central services VM replicated to Site B using vSphere replication

4 Failover Oracle Primary to Standby using SRM Call out Script from SAP Application VM at Site B

5 Connect/Resume SAP application to the Oracle Database in Site B

Page 12

Demo

Page 13

SRM Callout Script – odgfail.sh (Example)

~ # cat odgfail.sh
#! /bin/sh
#######################################################################################
# file name : odgfail.sh
# location : /scripts
# called from : Application VM on Site B
#######################################################################################
echo "Job `basename $0`: started at `date`"
#
# Set up standard ORACLE environment variables
ORACLE_SID=stdby; export ORACLE_SID
ORACLE_BASE=/oracle; export ORACLE_BASE
ORACLE_HOME=/oracle/PRD/102_64; export ORACLE_HOME
PATH=/oracle/PRD/102_64/bin:.:/oracle/PRD:/usr/sap/PRD/SYS/exe/run:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin; export PATH
LD_LIBRARY_PATH=/usr/sap/PRD/SYS/exe/run:/oracle/client/10x_64/instantclient; export LD_LIBRARY_PATH
#
# Failover to Standby
$ORACLE_HOME/bin/sqlplus /nolog <<EOFarch1
connect / as sysdba
--shutdown Primary database(in case of RAC, shutdown all RAC instances)
--Initiate failover to Standby Database:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE;
--Convert the physical standby database to the production role:
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
--Comment/Uncomment either of the 2 sets of commands below
--If the database was never opened read-only since the last time it was started,
--open new production database via:
ALTER DATABASE OPEN;
--If the physical standby database has been opened in read-only mode since the last time it was started,
--shutdown standby database and restart it
--SHUTDOWN IMMEDIATE
--STARTUP pfile=initSTDBY.ora
exit
EOFarch1
echo "Job `basename $0`: ended at `date`"
########################## end of script
~ #
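
The example script above always prints its "ended" message, whether or not the SQL statements succeeded. The following is a minimal sketch, not part of the presenters' script, of how the same statements could report failure through the exit status. It relies on the SQL*Plus `WHENEVER SQLERROR EXIT FAILURE` command and the `-S` (silent) flag; the function names and the overridable `SQLPLUS` variable are assumptions added for illustration, and the default path is the `ORACLE_HOME` from the example.

```shell
#!/bin/sh
# SQLPLUS is overridable so the control flow can be exercised without a database.
# The default path is taken from the example script's ORACLE_HOME.
SQLPLUS="${SQLPLUS:-${ORACLE_HOME:-/oracle/PRD/102_64}/bin/sqlplus}"

# Runs the failover statements; SQL*Plus exits non-zero on the first SQL error.
run_failover_sql() {
    "$SQLPLUS" -S /nolog <<'EOSQL'
WHENEVER SQLERROR EXIT FAILURE
connect / as sysdba
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE;
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
ALTER DATABASE OPEN;
exit
EOSQL
}

# Hypothetical wrapper: turns the SQL*Plus result into a message and exit status.
failover_with_status() {
    if run_failover_sql; then
        echo "failover: OK"
        return 0
    else
        echo "failover: FAILED" >&2
        return 1
    fi
}
```

A callout script could end with `failover_with_status || exit 1` so that the caller sees a non-zero exit status. Note that `WHENEVER SQLERROR` reacts to SQL and PL/SQL errors only; a failed `connect` would need a separate check.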

Page 14

EMC Reference Architecture

Page 15

EMC RA – Storage Replication Solution Overview

Page 16

Oracle Database Configuration – Storage Layouts

Page 17

Solution Testing Findings

• Integration of RecoverPoint with vCenter Site Recovery Manager enables DR testing to be carried out in isolated environments on the recovery site so that production can remain active and replication can continue uninterrupted. SRM also documents the recovery process.

• RecoverPoint enables replication of entire virtualized Oracle environments between data centers for disaster recovery.

• The RecoverPoint splitter supports replication across heterogeneous storage platforms.

http://www.emc.com/collateral/hardware/white-papers/h8207-dr-oracle-vmaxe-recoverpoint-srm-wp.pdf

Page 21

Best Practices

Page 22

Oracle DB on VMware Technical Best Practices

Server selection

Storage selection

vSphere version

vSphere operations

Performance monitoring

Guest operating system configuration

• Virtual storage presentation

• Workload and datastore fan-in ratios

• vCPU allocation

• Memory

• Network

• Security

• Cloning

• Disaster recovery

Page 23

General Best Practices

• Create a computing environment optimized for vSphere

• Enable the required settings in the ESX host BIOS, for example VT, Turbo Mode, and hyper-threading

• Disable unnecessary foreground and background processes on the guest operating system

• Create golden images of optimized operating systems using vSphere cloning technologies

• Upgrade to vSphere 5 for a 10–20% performance boost

• Allow vSphere to choose the best virtual machine monitor based on the CPU and guest operating system combination. The CPU/MMU Virtualization option in the virtual machine settings must be set to Automatic.

• Use Oracle's recommended installation guidelines for the respective operating system, the same as on physical hardware

• To minimize time drift in virtual machines, follow the guidelines in these KB articles:

Timekeeping best practices for Linux guests: http://kb.vmware.com/kb/1006427

Timekeeping best practices for Windows, including NTP: http://kb.vmware.com/kb/1318

Page 24

Virtual CPUs

Best Practices for vCPUs

• Do not over-allocate vCPUs – try to match the exact workload

• If the exact workload is unknown, start with fewer vCPUs initially and increase later if necessary

• For larger production workloads, the total number of vCPUs assigned to all virtual machines should be less than or equal to the total number of cores on the ESX host

• Enable hyper-threading for Intel Core i7 processors

• For 5500 series processors, enabling hyper-threading is recommended

• If unsure of the workload, use hardware vendor recommended Oracle sizing guidelines

• Avoid remote NUMA access by sizing the number of vCPUs to be no greater than the number of cores on a NUMA node (processor socket)
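
Two of the rules above are simple numeric comparisons: total vCPUs across all VMs versus host cores, and a single VM's vCPUs versus the cores on one NUMA node. A small sketch of such a check follows. The function name `vcpu_check` and its argument order are hypothetical; the inputs would have to be collected from the host inventory by other means.

```shell
#!/bin/sh
# vcpu_check TOTAL_VCPUS HOST_CORES VM_VCPUS CORES_PER_NUMA_NODE
# Prints "OK" and returns 0 when both sizing rules hold,
# otherwise prints one warning per violated rule and returns 1.
vcpu_check() {
    total_vcpus=$1
    host_cores=$2
    vm_vcpus=$3
    numa_cores=$4
    status=0
    # Rule: total vCPUs across all VMs <= total cores on the host.
    if [ "$total_vcpus" -gt "$host_cores" ]; then
        echo "WARN: $total_vcpus vCPUs allocated, host has $host_cores cores"
        status=1
    fi
    # Rule: one VM's vCPUs <= cores on a single NUMA node.
    if [ "$vm_vcpus" -gt "$numa_cores" ]; then
        echo "WARN: VM has $vm_vcpus vCPUs, NUMA node has $numa_cores cores"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "OK"
    fi
    return "$status"
}
```

For instance, `vcpu_check 16 16 8 8` passes both rules, whereas `vcpu_check 20 16 12 8` reports both an over-allocated host and a VM that spans NUMA nodes.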

Page 25

Virtual Memory Best Practices

• Do not overcommit memory until vCenter reports that steady-state usage is below the amount of physical memory on the server

• Do not disable the balloon driver (installed with VMware Tools)

• Set the memory reservation to SGA size plus OS. (Reservation and configured memory might be the same.)

• Enable hardware-assisted virtualization in the ESX host BIOS and on the VM

• Set CPU/MMU virtualization option to Automatic

• vSphere will choose the best Virtual Machine Monitor option based on the CPU/guest OS combination

• Use Large Memory Pages

• Consult Oracle Administration Guide for sizing of SGA
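
The reservation rule (SGA plus OS) and the large-pages recommendation both reduce to arithmetic. The sketch below assumes a Linux guest using HugePages, which the slide does not specify, and the helper names are hypothetical. The OS allowance and the huge page size are inputs rather than recommendations; on Linux the page size can be read from the `Hugepagesize` line of `/proc/meminfo`.

```shell
#!/bin/sh
# reservation_mb SGA_MB OS_MB
# Memory reservation as described on the slide: SGA size plus an OS allowance.
reservation_mb() {
    echo $(( $1 + $2 ))
}

# hugepages_for_sga SGA_MB HUGEPAGE_KB
# Number of huge pages needed to back the SGA, rounded up
# (assumes a Linux guest; HUGEPAGE_KB is the Hugepagesize value in kB).
hugepages_for_sga() {
    sga_kb=$(( $1 * 1024 ))
    echo $(( (sga_kb + $2 - 1) / $2 ))
}
```

With a 16384 MB SGA, a 2048 MB OS allowance, and 2048 kB huge pages, `reservation_mb 16384 2048` gives 18432 and `hugepages_for_sga 16384 2048` gives 8192.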

Page 26

Network Best Practices

• Separate infrastructure traffic from virtual machine traffic for security and isolation

• Use NIC teaming for availability and load balancing

• Take advantage of Network I/O Control (NIOC) to converge network and storage traffic onto 10GbE

• For “chatty” virtual machines on same host, connect to same vSwitch to avoid NIC traffic

• Use VMXNET3 Paravirtualized network adapter drivers to increase performance

• Reduces overhead versus vlance or E1000 emulation

• Must have VMware Tools to enable VMXNET3

• Use jumbo frames

• To configure, see iSCSI and Jumbo Frames configuration on ESX 3.x and ESX 4.x: http://kb.vmware.com/kb/1007654

• Separate RAC interconnect network to isolate it from other traffic
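
Whether jumbo frames are actually in effect inside a Linux guest can be checked by reading the interface MTU from sysfs. The sketch below is one way to do that. The function name `mtu_is_jumbo`, the 9000-byte threshold, and the overridable `SYSFS_NET` variable are assumptions for illustration; the check covers only the guest interface, not the vSwitch or physical switches along the path.

```shell
#!/bin/sh
# Root of the per-interface sysfs tree; overridable for testing.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# mtu_is_jumbo INTERFACE
# Prints "jumbo (N)" and returns 0 if the MTU is at least 9000,
# "standard (N)" and returns 1 otherwise, "unknown" and returns 2
# if the interface's MTU file cannot be read.
mtu_is_jumbo() {
    mtu_file="$SYSFS_NET/$1/mtu"
    if [ ! -r "$mtu_file" ]; then
        echo "unknown"
        return 2
    fi
    mtu=$(cat "$mtu_file")
    if [ "$mtu" -ge 9000 ]; then
        echo "jumbo ($mtu)"
        return 0
    fi
    echo "standard ($mtu)"
    return 1
}
```

Running `mtu_is_jumbo eth1` against a RAC interconnect or IP storage interface (the interface name here is only an example) gives a quick indication of whether the guest side is configured as intended.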

Page 27

Storage Virtualization Concepts

• Storage array – consists of physical disks that are presented as logical disks (storage array volumes or LUNs) to the ESX host

• Storage array LUNs – formatted as VMware vSphere® VMFS volumes

• Virtual disks – presented to the guest operating system, and can be partitioned and used in guest file systems

Page 28

Storage Best Practices

• Use vSphere VMFS for single instance Oracle database deployments

• For IP-based storage (iSCSI and NFS), enable jumbo frames

• Create dedicated datastores to service database workloads

• Align VMFS properly – use vCenter to create VMFS partitions, because it automatically aligns the partitions

• Use Oracle Automatic Storage Management (ASM)

• Follow your storage vendor’s best practices documentation when laying out the Oracle database

• Use Paravirtualized SCSI adapters for Oracle datafiles with demanding workloads

http://www.vmware.com/files/pdf/partners/oracle/Oracle_Databases_on_VMware_-_Best_Practices_Guide.pdf

Page 29

Summary

Page 30

Benefits of Oracle Databases on VMware

Performance
• I/O is not an issue
• Scale up and out
• Newer hardware can increase performance

Rapid Provisioning
• Streamline activation, deployment, and validation of servers
• Avoid manual configuration errors

Server Consolidation
• Fully utilize hardware
• Maintain application isolation

Workload Management
• Scale dynamically and right-size infrastructure
• Zero downtime maintenance
• Migrate live databases

High Availability
• VMware vSphere® vMotion®, VMware vSphere High Availability (HA), VMware vSphere® Fault Tolerance (FT), VMware vSphere Distributed Resource Scheduler (DRS)
• Without clustering or RAC

Business Continuity
• VMware vCenter Site Recovery Manager™
• Hardware reduction at failover site
• Comprehensive testing of DR solution

Page 31

Where Can I Learn More?

vCenter Site Recovery Manager

• Product Page – www.vmware.com/products/srm

• Overview, datasheet, webinars, docs, community links

Oracle Data Guard

• Overview – http://www.oracle.com/technetwork/database/features/availability/dataguardoverview-083155.html

Virtualizing Oracle with VMware

• External Solution Page – http://www.vmware.com/solutions/business-critical-apps/oracle-virtualization/oracle-database.html

Blog

• http://blogs.vmware.com/apps/oracle/

Page 32

Questions

Page 33

Disaster Recovery Solution with Oracle

Data Guard and Site Recovery Manager

VMware, Inc.

3401 Hillview Ave

Palo Alto, CA 94304

Tel: 1-877-486-9273 or 650-427-5000

Fax: 650-427-5001

Page 34

Other VMware Activities Related to This Session

HOL:

HOL-SDC-1305

Business Continuity and Disaster Recovery In Action

Group Discussions:

BCO1003-GD

Disaster Recovery and Replication with Ken Werneburg

BCO4905

Page 35

THANK YOU

Page 38

Backup Slides

Page 39

Failover

A failover is performed when the production database fails and one of the standby databases is transitioned to take over the production role, allowing business operations to continue. Once the failover is complete and applications have resumed, the administrative staff can turn its attention to resolving the problems with the failed system. Failover may or may not result in data loss, depending on the Data Guard protection mode in effect at the time of the failover. There are two distinct types of failover: manual failover and fast-start failover.

Steps after the primary database crashes (all performed at the standby site):

1 Initiate failover to the standby database:

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE

In rare circumstances, DBAs may wish to avoid waiting for the standby to finish applying redo in the current standby redo log file before performing the failover, and so may issue an ‘ALTER DATABASE ACTIVATE STANDBY DATABASE’ command to perform an immediate failover. This causes any unapplied redo in the standby redo log to be lost.

2 Convert the physical standby database to the production role:

ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY

3 If the database was never opened read-only since the last time it was started, open the new production database via:

ALTER DATABASE OPEN

If the physical standby database has been opened in read-only mode since the last time it was started, shut down the standby database and restart it:

SHUTDOWN IMMEDIATE

STARTUP
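
Step 3 is a two-way branch that depends on whether the standby has been opened read-only since its last start. A minimal sketch of that decision follows. The function name `open_step_sql` and its `yes`/`no` argument are hypothetical; determining whether the database was opened read-only is left to the operator.

```shell
#!/bin/sh
# open_step_sql WAS_OPENED_READ_ONLY
# Prints the step 3 commands for the given case:
#   no  -> the database was never opened read-only since its last start
#   yes -> it has been opened read-only, so it must be restarted
open_step_sql() {
    case "$1" in
        no)
            printf '%s\n' "ALTER DATABASE OPEN;"
            ;;
        yes)
            printf '%s\n' "SHUTDOWN IMMEDIATE" "STARTUP"
            ;;
        *)
            echo "usage: open_step_sql yes|no" >&2
            return 2
            ;;
    esac
}
```

The output could be reviewed by a DBA or fed to SQL*Plus as the last part of a failover script, in place of commenting and uncommenting lines by hand.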

Page 40

Switchover

Switchover is a planned role reversal between the production database and one of its standby databases to avoid downtime during scheduled maintenance on the production system or to test readiness for future role transitions. A switchover guarantees no data loss.

Steps:

Step 1

Primary site: Get status of the primary database:

SELECT NAME, DB_UNIQUE_NAME, LOG_MODE, OPEN_MODE, PROTECTION_MODE, PROTECTION_LEVEL, DATABASE_ROLE, SWITCHOVER_STATUS FROM V$DATABASE

Ensure both log_archive_dest_state_1 (local archiving) and log_archive_dest_state_2 (archiving to standby) are enabled.

Standby site: Get status of the standby database:

SELECT NAME, DB_UNIQUE_NAME, LOG_MODE, OPEN_MODE, PROTECTION_MODE, PROTECTION_LEVEL, DATABASE_ROLE, SWITCHOVER_STATUS FROM V$DATABASE

Ensure log_archive_dest_state_1 (local archiving) is enabled and log_archive_dest_state_2 (archiving to primary) is disabled. Ensure there are no gaps in redo on the standby database.

Step 2

Primary site: Verify that it is possible to perform a switchover operation:

SELECT SWITCHOVER_STATUS FROM V$DATABASE

If the output is ‘SESSIONS ACTIVE’, either disconnect all sessions manually or append the “WITH SESSION SHUTDOWN” clause when performing step 3.

Step 3

Primary site: Convert the current primary database to the new physical standby:

ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN
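
Steps 2 and 3 amount to choosing the conversion statement from the SWITCHOVER_STATUS value reported on the primary. The sketch below illustrates that choice. The function name is hypothetical, only the two status values relevant to these steps are handled, and the clause is written as `WITH SESSION SHUTDOWN`, which is the spelling in Oracle's documented syntax.

```shell
#!/bin/sh
# switchover_sql_for_primary STATUS
# STATUS is the SWITCHOVER_STATUS value from V$DATABASE on the primary.
# Prints the statement for step 3, or returns 1 if switchover should not proceed.
switchover_sql_for_primary() {
    case "$1" in
        "TO STANDBY")
            echo "ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY;"
            ;;
        "SESSIONS ACTIVE")
            # Active sessions remain: let the statement disconnect them.
            echo "ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;"
            ;;
        *)
            echo "switchover not possible from status: $1" >&2
            return 1
            ;;
    esac
}
```

Any other status (for example `NOT ALLOWED`) stops the procedure, so the cause can be investigated before roles are changed.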

Page 41

Switchover (cont’d)

Step 4

Primary site: Shut down the former primary and mount it as a standby database:

SHUTDOWN IMMEDIATE

STARTUP NOMOUNT PFILE=initPRD.ora

ALTER DATABASE MOUNT STANDBY DATABASE

Defer the remote archive destination on the old primary:

ALTER SYSTEM SET log_archive_dest_state_2=DEFER

Standby site: Verify that the old physical standby can be converted to the new primary:

SELECT SWITCHOVER_STATUS FROM V$DATABASE

Step 5

Standby site: Convert the old physical standby to the new primary:

ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN

If the physical standby database has not been opened in read-only mode since the last time it was started:

ALTER DATABASE OPEN

Shut down and start up the new primary database:

SHUTDOWN IMMEDIATE

STARTUP PFILE=initSTDBY.ora

Step 6

Primary site: Start managed recovery on the new standby database:

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION

Standby site: Enable remote archiving on the new primary to the new standby:

ALTER SYSTEM SET log_archive_dest_state_2=ENABLE