
WHITE PAPER

ORACLE DATABASE BACKUP, RECOVERY, AND REPLICATIONS BEST PRACTICES WITH VMAX ALL FLASH STORAGE

January 2018

VMAX® Engineering White Paper

Abstract

This white paper provides details on the best practices for backup, recovery, and replications of Oracle databases with Dell EMC® VMAX® All Flash storage arrays.

H14232.1

This document is not intended for audiences in China, Hong Kong, Taiwan, and Macao.

Copyright

The information in this publication is provided as is. Dell Inc. makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any software described in this publication requires an applicable software license.

Copyright © 2018 Dell Inc. or its subsidiaries. All Rights Reserved. Dell, EMC, Dell EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries. Intel, the Intel logo, the Intel Inside logo and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. Other trademarks may be the property of their respective owners. Published in the USA January 2018 White Paper H14232.1

Dell Inc. believes the information in this document is accurate as of its publication date. The information is subject to change without notice.

Contents

Chapter 1 Executive Summary
  Executive overview
  Terminology
  We value your feedback

Chapter 2 Product Overview
  VMAX All Flash product overview
  VMAX SnapVX product overview
  VMAX SRDF product overview
  VMAX and T10-DIF protection from silent corruptions

Chapter 3 Use Cases Considerations, Lab Configuration, and VMAX Device Identification
  Considerations for Oracle database replications with SnapVX and SRDF
  Lab configuration
  VMAX device identification on a database server

Chapter 4 Restartable Database Snapshots
  Restartable database snapshots overview and requirements
  Creating restartable database snapshot
  Mounting restartable snapshot
  Refreshing restartable snapshot
  Mounting restartable snapshot with a new DBID and file location
  Restoring restartable snapshot

Chapter 5 Recoverable Database Snapshots
  Recoverable database snapshots overview and requirements
  Creating recoverable database snapshot
  Mounting recoverable snapshot
  Opening a recoverable database on a mount host
  Database integrity validation on a mount host
  RMAN backup offload to a mount host
  RMAN minor recovery of production database using snapshot
  Production restore from a recoverable snapshot
  Instantiating an Oracle Standby Database using VMAX replications

Chapter 6 Remote Replications with SRDF
  Remote replications with SRDF overview and requirements
  Initiating database replications with SRDF
  Failover operations to the remote site
  Creating remote restartable database snapshots
  Mounting a remote restartable snapshot
  Refreshing remote restartable snapshot
  Mounting remote restartable snapshot with a new DBID and file location
  Creating remote recoverable database snapshots
  Mounting remote recoverable snapshot
  RMAN backup offload to a remote mount host
  Opening a remote recoverable database on mount host
  Production restore from a remote recoverable snapshot

Chapter 7 Summary and Conclusion
  Summary

Chapter 1 Executive Summary

This chapter presents the following topics:

Executive overview
Terminology
We value your feedback

Executive overview

Many applications are required to be fully operational 24x7x365, even as their data continues to grow. At the same time, their RTO and RPO requirements are becoming more stringent. As a result, there is an increasing demand for faster and more efficient data protection.

Traditional backup methods cannot satisfy this demand because of the long duration and host overhead required to create full backups. More importantly, during recovery, the recovery process itself (rolling transactions forward) cannot start until the initial image of the database is fully restored, which can take many hours.

This has led many data centers to use storage snapshots for more efficient protection. Dell EMC SnapVX snapshots take seconds to create or restore, regardless of database size. During a restore, the data is made available immediately to the user, even while remaining changes are copied in the background.

SnapVX eliminates both the problem of elongated copy time associated with backups, and the huge RTOs associated with a database restore. These problems plague host-based backup solutions designed for medium and large mission-critical databases.

SnapVX also allows fast creation of database replicas for testing, development, reporting, staging, making gold copies, and more. All SnapVX replicas are consistent by default, allowing the creation of replicas in seconds while the production database is active.

Dell EMC SRDF extends SnapVX capabilities by creating synchronous, asynchronous, or active/active remote replication solutions. With SRDF, the database is replicated to a remote storage array to provide additional protection within the data center, across data centers, or even across continents.

With SRDF/Metro, both source and target devices are writable and in sync, enabling an Oracle extended RAC solution. SRDF/Metro changes the data protection framework from a failover solution to a continuous availability solution. It allows Oracle databases and applications to continue operations through many possible disasters, including host, network, SAN, or even storage unavailability.

Audience

This white paper is intended for database administrators, system administrators, storage administrators, and system architects who are responsible for implementing Oracle database backups and replications with VMAX All Flash storage systems. Readers should have some familiarity with Oracle databases and VMAX storage arrays.

Terminology

Table 1 explains important terms used in this paper.

Table 1. Terminology

Oracle Automatic Storage Management (ASM): Oracle ASM is a robust volume manager and a virtual file system that Oracle databases can use to store database files. ASM can be configured as a single-server or clustered solution, can provide mirroring, allows online storage migrations, and much more.

Oracle Real Application Clusters (RAC): Oracle RAC is a clustered version of the Oracle database based on a comprehensive high-availability stack that can be used as the foundation of a database cloud system as well as a shared infrastructure, ensuring high availability, scalability, and agility for applications.

Restartable vs. recoverable database: Oracle distinguishes between a restartable and a recoverable state of the database. A restartable state requires all log, data, and control files to be consistent (see "Storage consistent replications" in this table). For example, after a server crash, a database shutdown abort, or a consistent snapshot, the database is in a restartable state. Oracle can simply be started, performing automatic crash/instance recovery to the point in time just before the snapshot or crash took place. A recoverable state, on the other hand, requires re-applying transaction logs to the data files (often from the archive logs) before the database can be opened.

Rolling disasters: Rolling disasters is a term used when a first disaster disrupts the normal database protection strategy and is followed by a second disaster. For example, the dropping of remote replications followed by a later loss of the production site, or silent corruptions at the remote database followed by the loss of the production site.

RTO and RPO: Recovery Time Objective (RTO) refers to the time it takes to recover a database after a failure. Recovery Point Objective (RPO) refers to the amount of data loss after the recovery completes, where RPO=0 means no data loss of committed transactions.

Storage consistent replications: Storage consistent replications refer to storage replications (local or remote) in which the replica maintains write-order fidelity. That means that for any two dependent I/Os, such as a log write followed by a data update, either both will be included in the replica, or only the first. SnapVX replicas are always consistent, and when performed correctly (including all log, data, and control files), the database replica is restartable. Starting with Oracle Database 11gR2, Oracle allows database recovery from storage consistent replications without the use of hot-backup mode (details in Oracle support note 604683.1). The feature is integrated with Oracle Database 12c and is called Oracle Storage Snapshot Optimization.

Dell EMC ProtectPoint: ProtectPoint is a product that directly integrates Data Domain with VMAX to provide a very fast backup and restore solution for Oracle databases, including those residing in ASM. It can leverage SnapVX technology to send just the changed data directly to Data Domain with each database backup, or restore just the changes. It does not require host resources for either operation.

VMAX HYPERMAX OS: HYPERMAX OS is the industry's first open converged storage hypervisor and operating system. It enables VMAX to embed storage infrastructure services like cloud access, data mobility, and data protection directly on the array. This delivers new levels of data center efficiency and consolidation by reducing footprint and energy requirements.

VMAX storage group (SG): A collection of host-addressable VMAX devices. An SG can be used to present devices to hosts (LUN masking), manage grouping of devices for SnapVX and SRDF® operations, monitor performance, and more.

VMAX composite or consistency group (CG): A collection of host-addressable VMAX devices. A CG can manage consistent local replications with SnapVX when the application storage devices are spread across multiple VMAX arrays; in this case it is referred to as a composite group. A CG can also manage consistent remote replications with SRDF when the application storage devices are spread across multiple arrays or SRDF groups; in this case it is referred to as a consistency group.

VMAX TimeFinder SnapVX: TimeFinder SnapVX is the latest generation of TimeFinder local replication software, offering high-scale, in-memory, pointer-based, consistent snapshots.

VMAX SRDF: SRDF (Symmetrix Remote Data Facility) is the VMAX remote replication technology, which allows batch transfers of data, and synchronous, asynchronous, active/active, and cascaded replications between multiple VMAX arrays. SRDF is tightly integrated with SnapVX to allow utilizing snapshots at the local or remote arrays.

We value your feedback

Dell EMC and the authors of this document welcome your feedback on the solution and the solution documentation. Contact [email protected] with your comments.

Author: Yaron Dar


Chapter 2 Product Overview

This chapter presents the following topics:

VMAX All Flash product overview
VMAX SnapVX product overview
VMAX SRDF product overview
VMAX and T10-DIF protection from silent corruptions

VMAX All Flash product overview

VMAX All Flash family

The VMAX family of storage arrays is built on the strategy of simple, intelligent, modular storage. It incorporates a Dynamic Virtual Matrix interface that connects and shares resources across all VMAX engines, enabling the storage array to seamlessly grow from an entry-level configuration into the world's largest storage array. VMAX storage provides the highest levels of performance, scalability, and availability, and features advanced hardware and software capabilities.

In 2016, Dell EMC announced the VMAX All Flash 250F, 450F, and 850F storage arrays. In May 2017, Dell EMC introduced the VMAX 950F, which replaces the VMAX 450F and 850F and provides higher performance at a similar cost.

VMAX All Flash arrays, as shown in Figure 1, provide a combination of ease of use, scalability, high performance, and a robust set of data services that makes them an ideal choice for database deployments.

Figure 1. VMAX All Flash 950F (left) and 250F (right) storage arrays

VMAX All Flash benefits

VMAX All Flash storage arrays provide the following benefits:

Ease of use—Uses virtual provisioning to create new storage devices in seconds. All VMAX devices are thin, consuming only the storage capacity that is actually written to, which increases storage efficiency without compromising performance. VMAX devices are grouped into storage groups and managed as a unit for operations such as device masking to hosts, performance monitoring, local and remote replications, compression, and host I/O limits. In addition, you can manage VMAX devices by using Unisphere for VMAX, Solutions Enabler CLI, or REST APIs.

High performance—Designed for high performance and low latency, VMAX arrays scale from one to eight engines. Each engine consists of dual directors, where each director includes two-socket Intel CPUs, front-end and back-end connectivity, a hardware compression module, InfiniBand internal fabric, and a large mirrored and persistent cache.



All writes are acknowledged to the host as soon as they are registered with VMAX cache. Only later are writes destaged to flash, perhaps after multiple database updates. Reads also benefit from the large VMAX cache. When a read is requested for data that is not already in cache, FlashBoost technology delivers the I/O directly from the back-end (flash) to the front-end (host). Reads are only later staged in cache for possible future access. VMAX also excels in servicing high-bandwidth sequential workloads that leverage pre-fetch algorithms, optimized writes, and fast front-end and back-end interfaces.

NOTE: VMAX All Flash cache is large (from 512 GB to 16 TB, based on configuration), mirrored, and persistent due to the vault module that protects the cache content in case of power failure and restores it when the system comes back up.

Data services—Offers a strong set of data services. VMAX natively protects all data with T10-DIF from the moment data enters the array until it leaves (including replications). With SnapVX and SRDF, VMAX provides many topologies for consistent local and remote replications. Dell EMC ProtectPoint™ provides an integration with Data Domain™, and Dell EMC CloudArray™ provides cloud gateways. Other VMAX data services include Data at Rest Encryption (D@RE), Quality of Service (QoS), compression, the "call home" support feature, non-disruptive upgrades (NDU), non-disruptive migrations (NDM), and more. In virtual environments, VMAX also supports VMware vStorage APIs for Array Integration (VAAI) primitives such as write-same and xcopy.

NOTE: Two separate features support VMAX QoS. The first is host I/O limits, which place IOPS or bandwidth limits on "noisy neighbor" applications (sets of devices) such as test/dev environments. The second slows down the copy rate for local or remote replications.

NOTE: While the topic is not covered in this paper, you can also purchase VMAX as part of a Converged Infrastructure (CI). For details, refer to Dell EMC VxBlock System 740 and Dell EMC Ready Bundles for Oracle.

VMAX SnapVX product overview

SnapVX characteristics

TimeFinder SnapVX software delivers instant, storage-consistent, point-in-time replicas of host devices that can be used for purposes such as the creation of gold copies, patch testing, reporting and test/development environments, backup and recovery, data warehouse refreshes, or any other process that requires parallel access to, or preservation of, the primary storage devices.

The replicated devices can contain the database data, Oracle home directories, data that is external to the database (for example, image files), message queues, and so on.

Figure 2 shows the basic operations of SnapVX snapshots.



Figure 2. SnapVX operations

The following list describes the main SnapVX characteristics related to native snapshots (SnapVX operating in emulation mode for legacy behavior is not covered):

SnapVX snapshots are always space-efficient, as they are simply a set of pointers pointing to the original data when it is unmodified, or to their own version of the data when the source data is modified after the snapshot was taken. Multiple snapshots of the same data achieve both storage and memory savings by pointing to the same locations (tracks).

SnapVX snapshots are targetless, which means they cannot be used directly. Instead, snapshots can be restored back to the source devices, or linked to another set of target devices that match the source devices in size. The target devices can be host-accessible. A relink operation refreshes the target devices with new snapshot data.

Snapshot operations are performed on a group of devices. This group is defined by using a text file with device IDs, a device group (DG), a composite group (CG), a storage group (SG), or by simply specifying the devices. The recommended way is to use a storage group (SG).

Snapshots are taken using the establish command. When establishing a snapshot, provide a snapshot name and, optionally, set an expiration time. Each snapshot has a generation number, which is incremented if the same snapshot name is reused. Generation 0 is always the latest snapshot. The snapshot time is listed together with the snapshots, adjusted to the local time zone.

Snapshot operations take seconds to complete, regardless of the size of the data. For that reason, creating a snapshot of a large database is very fast. Restoring a snapshot also takes seconds, and as soon as it is done, the source devices reflect the snapshot data. If necessary, a background copy takes place and prioritizes any requests for tracks that were not yet copied. Similarly, a link operation takes seconds to complete, and when it is done, the target devices reflect the snapshot data.

For legacy reasons, the SnapVX link operation can use the option "-copy", which creates a full-copy clone. The outcome is that the original data is duplicated within the array and the target devices point to the copy. This behavior is not recommended with All Flash arrays due to the inefficiencies involved in the copy operation and the additional capacity consumed by the copy. Full copies do not improve performance or data resiliency.

Defining phase: initially, when a snapshot is linked to target devices, their data is accessed indirectly by using the snapshot pointers. As part of the background operation that takes place during link, the target devices' pointers are changed to point directly to the data. When this process ends, the snapshot is in a defined state, and the target devices become a stand-alone image of the snapshot data, regardless of whether the snapshot used '-copy', and regardless of whether the snapshot is later unlinked or terminated. Unless '-copy' was used, the source and linked-target devices point to shared data, creating dedupe-like efficiencies.

Linked-target devices cannot restore changes directly to the source devices. Instead, a new snapshot can be taken from the target devices and linked back to the original source devices. In this way, SnapVX allows an unlimited number of cascaded snapshots.

Snapshots are protected. Even if a snapshot is restored and the source devices are modified, or the snapshot is linked and the target devices are modified, the snapshot remains intact and can be reused over and over with the same original data. Optionally, snapshots can be secured. A secured snapshot cannot be terminated by users before its retention period expires.

SnapVX snapshots are always consistent, meaning that snapshot creation always maintains write-order fidelity. This allows easy creation of restartable database copies, or Oracle database recoverable backup copies based on Oracle Storage Snapshot Optimization.

Source devices can have up to 256 snapshots that can be linked to up to 1,024 targets, providing very high scalability.

For more information on SnapVX, refer to the Dell EMC HYPERMAX OS TimeFinder local replication technical note and the EMC Solutions Enabler CLI guides.
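The following is a minimal command sketch of the snapshot lifecycle described above, assuming a storage group named database_sg and a target storage group named database_mount_sg (both names are illustrative, and exact options may vary by Solutions Enabler version):

Create a named snapshot, optionally with a two-day time-to-live:
# symsnapvx -sg database_sg -name database_snap establish -ttl -delta 2

Link the snapshot to target devices (no-copy is the default):
# symsnapvx -sg database_sg -snapshot_name database_snap -lnsg database_mount_sg link

Refresh the target devices from a newer snapshot of the same name:
# symsnapvx -sg database_sg -snapshot_name database_snap -lnsg database_mount_sg relink

Restore the snapshot back to the source devices and, once done, terminate the restore session:
# symsnapvx -sg database_sg -snapshot_name database_snap restore
# symsnapvx -sg database_sg -snapshot_name database_snap terminate -restored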

VMAX SRDF product overview

The SRDF remote replication product family is trusted for disaster recovery and business continuity. SRDF offers a variety of replication modes between VMAX storage arrays that can be combined in different topologies, including two, three, and even four sites. SRDF and SnapVX are tightly integrated to offer a combined solution for local and remote replications.

SRDF modes

SRDF technology is based on a few basic modes of operation:

SRDF Synchronous (SRDF/S) mode creates a solution with no data loss of committed transactions. The target devices are an exact copy of the source devices (production database), though in read-only mode.

SRDF Asynchronous (SRDF/A) mode creates consistent replicas at unlimited distances without a write response time penalty to the application. The target devices are seconds to minutes behind the source devices (production database), though consistent ("restartable").



SRDF Adaptive Copy (SRDF/ACP) mode allows bulk transfers of data between source and target devices without write-order fidelity and without write performance impact to the source devices. Use SRDF/ACP during data migrations, or to resynchronize an SRDF target when many changes are owed to the target site. Set SRDF/ACP to perform a one-time transfer, or to continuously send changes in bulk until a specified delta between source and target remains. Once the delta is small enough, change the SRDF mode to another mode, such as SRDF/S or SRDF/A.

SRDF/Metro is an extension of SRDF/S. With SRDF/Metro, devices from both source and target arrays are in sync and can perform both reads and writes (an active/active topology). To the host, SRDF/Metro makes the source and replicated devices seem identical by giving them the same SCSI identity. As a result, the host software (usually a cluster) can benefit from high availability across distance, avoiding most of the added complexity of setting up geo-clusters. If one of the arrays becomes unavailable, the cluster software automatically fails over to the surviving site (Oracle RAC reconfiguration) and database operations continue from there without user intervention or downtime. SRDF/Metro is a great option for creating an Oracle extended cluster without adding complexity to the cluster configuration.

SRDF groups

SRDF devices are configured in groups and managed together as follows:

Create a relationship between source and target replicated devices by associating the local and remote devices with a dynamic SRDF group number and SRDF ports on each array.

The source devices in the SRDF group are called R1 devices, and the target devices are called R2 devices.

Once the SRDF device pairs are set, they can be managed using a text file specifying the list of devices, a device group (DG), a composite/consistency group (CG), or a storage group (SG). The recommended way is to use a storage group (SG) when working with a single array, or a consistency group (CG) when the R1 or R2 devices are spread across multiple arrays. One exception is that in order to enable SRDF/S consistency (see next section), a CG is required.
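As a hedged sketch of these steps, the following creates dynamic SRDF device pairs from a text file of local and remote device IDs and starts replication (the array IDs, group number, director ports, and file name are illustrative, and exact options may vary by Solutions Enabler version):

Create a dynamic SRDF group between the two arrays:
# symrdf addgrp -label ora_rdf -rdfg 10 -sid 000197700048 -dir 1F:28 -remote_rdfg 10 -remote_sid 000197700049 -remote_dir 1F:28

Create the R1/R2 device pairs and start copying data:
# symrdf createpair -file device_pairs.txt -sid 000197700048 -rdfg 10 -type RDF1 -establish

Set the replication mode (for example, asynchronous) using the storage group:
# symrdf -sid 000197700048 -sg database_sg -rdfg 10 set mode async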

SRDF and consistency

Sync (SRDF/S), Async (SRDF/A), and Metro SRDF modes all maintain the consistency of the replicated devices. For example, if the replication is terminated because the source site became unavailable for some reason, the database can simply restart from the target devices (or continue operations from the SRDF/Metro target devices). The following are considerations around SRDF consistency:

An SRDF consistency group is an SRDF group for which consistency has been enabled. Consistency can be enabled for either synchronous or asynchronous replication mode.

An SRDF consistency group maintains write-order fidelity (also called dependent-write consistency) to make sure that the target devices always provide a restartable replica of the source application.

NOTE: Even when consistency is enabled, the remote devices may not yet be consistent while the SRDF state is sync-in-progress. This happens while SRDF initial synchronization is taking place, before the session enters a 'consistent' or 'synchronized' replication state.



Enabling SRDF consistency is important because it means that if a single device in a consistency group cannot replicate, then the whole group stops replicating to preserve the target devices' consistency. For that reason, it is recommended that consistency is enabled not only for SRDF/A, but also for SRDF/S. Enabling consistency for SRDF/S requires the use of a CG.

Multiple SRDF groups set in SRDF/A mode can be combined within a single array or across arrays. Such a grouping of consistency groups is called multi-session consistency (MSC). MSC maintains dependent-write-consistent replication across all the participating SRDF groups.

SRDF session

An SRDF session shows the state of the replication between the source and target devices. The following are considerations for SRDF sessions:

An SRDF session is created when replication starts between R1 and R2 devices in an SRDF group.

An SRDF session can establish replication from R1 to R2 devices. R1 and R2 devices require a full copy only at the first establish operation. Any subsequent establish (for example, after an SRDF split or suspend) is incremental, passing only changed data.

An SRDF session can restore the content of R2 devices back to R1. Restore is incremental, moving only changed data across the links. SnapVX and SRDF can restore in parallel; for example, they can be used together to bring back a remote backup image.

Except for SRDF/Metro, during SRDF replications the devices to which data is replicated are write-disabled (read-only).

An SRDF session can be suspended, temporarily halting replication until a resume command is issued.

An SRDF session can be split, which not only suspends the replication but also makes the R2 devices read-writable.

An SRDF checkpoint command does not return the prompt until the content of the R1 devices has reached the R2 devices. This option helps in creating remote database backups when SRDF/A is used.

An SRDF swap changes the R1 and R2 personalities, so replication for the session can switch direction.

An SRDF failover makes the R2 devices writable. The R1 devices, if still accessible, change to write-disabled (read-only). The SRDF session is suspended, and application read-write operations proceed on the R2 devices.

An SRDF failback copies changed data from the R2 devices back to R1 and makes the R1 devices writable. The R2 devices are made write-disabled (read-only).

An SRDF update copies changed data from the R2 devices back to R1, though it leaves the R2 devices operational. SRDF update is often used after a failover, once the R1 site becomes available again but needs to catch up with the changes on the R2.

SRDF replication sessions can go in either direction (bi-directional) between the two arrays, where different SRDF groups can replicate in different directions.
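A brief sketch of some of these session controls follows, using the storage group from this paper's examples and an illustrative SRDF group number (the commands are representative; verify the options for your Solutions Enabler version):

Temporarily halt replication, and later resume it:
# symrdf -sid 048 -sg database_sg -rdfg 10 suspend
# symrdf -sid 048 -sg database_sg -rdfg 10 resume

Wait until all R1 data has reached the R2 side (useful for remote backups with SRDF/A):
# symrdf -sid 048 -sg database_sg -rdfg 10 checkpoint

Move operations to the remote site, and later return them:
# symrdf -sid 048 -sg database_sg -rdfg 10 failover
# symrdf -sid 048 -sg database_sg -rdfg 10 failback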



VMAX and T10-DIF protection from silent corruptions

Silent corruptions can be introduced to Oracle database data anywhere in the stack, from the time the data is created in server memory until it reaches the VMAX storage (writes), or the other way around (reads). Silent corruptions can be caused by hardware or software bugs, or by disasters such as someone pulling an FC cable in the middle of an I/O, or hard server crashes.

As a result, the Oracle block structure or data can be either incorrect or incomplete, and often neither the database nor the user will know about it until a database read of that block takes place, which can be minutes, hours, days, or perhaps even longer.

To avoid silent corruptions, VMAX utilizes a SCSI standard called T10-PI (Protection Information), sometimes referred to as T10-DIF (Data Integrity Field). With DIF, the 512-byte disk sector geometry is extended to 520 bytes, adding 8 bytes of protection information to each such block. The protection information includes three parts: a 16-bit guard tag, which is used for a CRC check; a 32-bit reference tag, which is used to validate the correct address (location) of the block; and an application tag, which can be used in different ways but is currently mostly ignored.

Internally, VMAX utilizes DIF extensively from the moment an I/O arrives at the array, and while it passes through the different emulations, such as front-end, memory, and back-end (disk). It is important to realize that VMAX uses DIF for all replications, local and remote, to validate that the data is replicated accurately.

Externally, VMAX can work with other layers that support external DIF, such as HBAs, the Linux kernel, and even Oracle ASM. In this way, the protection is extended between the host and the VMAX storage for all reads and writes, including storage replications.

For more information about Oracle's involvement in supporting external DIF, see https://oss.oracle.com/~mkp/docs/OOW2011-DI.pdf. For supported configurations with external DIF, refer to the Dell EMC eLab Navigator, VMAX All Flash/VMAX3 Features Simple Support Matrix.

Chapter 3 Use Cases Considerations, Lab Configuration, and VMAX Device Identification

This chapter presents the following topics:

Considerations for Oracle database replications with SnapVX and SRDF
Lab configuration
VMAX device identification on a database server

Considerations for Oracle database replications with SnapVX and SRDF

Number of snapshots, frequency, and retention

VMAX All Flash SnapVX allows up to 256 snapshots per source device with minimal cache and capacity impact. SnapVX minimizes the impact of production host writes by using intelligent redirect-on-write and asynchronous copy-on-first-write. Both methods allow production host writes to complete without delay from background data copies, while production data is modified and the snapshot data preserves its point-in-time consistency.

If snapshots are used as part of a disaster protection strategy, the frequency of creating snapshots can be determined based on the RTO and RPO needs (a scheduling example follows this list):

For a "restart" solution, where no roll-forward is planned, take snapshots at very short intervals (minutes) to ensure that the RPO is limited to that interval. For example, if a snapshot is taken every 15 minutes, there will be no more than 15 minutes of data loss if it is necessary to restore the database without recovery.

For a "recovery" solution, frequent snapshots ensure that the RTO is short, as less data will need recovery during roll-forward of logs to the current time. For example, if snapshots are taken every 15 minutes, rolling the data forward from the last snapshot will be much faster than rolling forward from a nightly backup.
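For example, a 15-minute snapshot interval with a two-day retention could be scheduled from a management host as follows (a minimal sketch; the storage group name and retention are illustrative, and cron requires escaping the % characters):

*/15 * * * * symsnapvx -sg database_sg -name database_$(date +\%Y\%m\%d-\%H\%M\%S) establish -ttl -delta 2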

Copy vs. no-copy snapshot target

SnapVX snapshots cannot be directly accessed by a host. They can be either restored to the source devices or linked to up to 1,024 sets of target devices. When linking any snapshot to target devices, SnapVX allows using the copy or no-copy option, where no-copy is the default.

No-copy option: No-copy linked targets remain space-efficient by sharing pointers with the production devices and the snapshot. Only changes to either the linked targets or the production devices consume additional storage capacity. It is important to know that no-copy linked targets retain their data even after they are unlinked. This requires them to first be in the 'defined' state, meaning that the target devices' pointers point directly to the storage data and no longer use indirect pointers via the snapshot.

Copy option: Alternatively, the linked targets can have their own full copy of the data, without sharing pointers with the production devices and snapshot. The copy option is not recommended for VMAX All Flash because it consumes much more capacity without providing performance or resiliency advantages over no-copy linked targets. It is mainly used for legacy operations, or with products such as ProtectPoint, where the target devices are actually Data Domain encapsulated devices.

Oracle database restartable, recoverable, and hybrid snapshots

SnapVX creates consistent snapshots by default, which are well suited for a database restart solution. Simply open a restartable database replica; it will perform crash or instance recovery, just as if the server rebooted or the DBA performed a shutdown abort. To achieve a restartable solution, all data, control, and redo log files must participate in the consistent snapshot. Archive logs are not required and are not used during database crash/instance recovery. Restartable database snapshots are covered in Chapter 4.

SnapVX can also create recoverable replicas. A recoverable database replica can perform database recovery to a desired point in time using archive and redo logs. Oracle Database 12c enhanced the ability to create a recoverable database solution based on storage replications by leveraging storage consistency instead of hot-backup mode. This feature of Oracle Database 12c is called Oracle Storage Snapshot Optimization.

For a recoverable snapshot that will be recovered on the production host, and that therefore relies on the available redo logs and archive logs, the snapshot can include just the data files. However, if the snapshot will be used on another host (such as when using linked targets and presenting them to a mount host), take an additional snapshot of the archive logs, following the best practice described in Chapter 5.

Redo logs are not required for a recoverable snapshot and are not part of a roll-forward, since the redo logs in the snapshot will never include the latest transactions. However, the redo logs may optionally be included so that the DBA does not have to create the +REDO ASM disk group from scratch on the mount host. Redo logs can also be used for creating a hybrid replica, as explained in the next paragraph.

To create a hybrid replica that can be used for both recovery and restart, include all data, control, and redo log files in the first snapshot (or SRDF session), and the archive logs in a second snapshot (or SRDF session), following the best practice for recoverable database replicas.

If a restartable solution is chosen, the archive log replica is not needed, but it can be used on the mount host if the DBA wants a +FRA ASM disk group identical to production's available after the database restart takes place. If a recoverable solution is chosen, the replica of the redo logs is not needed (especially if restoring back to production, to avoid overwriting production's redo logs). However, on a mount host, the DBA may want a +REDO ASM disk group identical to production's available after the database recovery takes place. The use cases in this paper always create a snapshot with both +REDO and +DATA included, to allow the greater flexibility of the hybrid replica.

RMAN and storage replications

Oracle Recovery Manager (RMAN) is tightly integrated with the Oracle database. It can perform host-based backups and restores on its own, but it can also work very effectively with storage snapshots.

RMAN backups can be performed from VMAX snapshots that are mounted to a mount host, sending the backup to a target outside the VMAX, such as Data Domain. Since RMAN does not depend on which host it operates from, it can later restore that backup directly to production.

RMAN incremental backups can continue to leverage Oracle Block Change Tracking, even if the backup was offloaded to the mount host. RMAN can also use the mount host to validate the database integrity.

Restore optimizations are realized when combining RMAN with storage snapshots. Once we restore a recoverable snapshot to production, RMAN can use it to finish the database recovery operations on that image, combining the power of RMAN with storage snapshots.

RMAN can also leverage the storage snapshot as a copy of production. Mount the snapshot to the production host with a new location (for instance, a new ASM disk group name). Once RMAN catalogs it, the snapshot can be used to quickly recover any corruptions in the production database.
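As a hedged illustration of this last approach, once the snapshot is mounted to the production host under a renamed ASM disk group (for example, +DATA2, an illustrative name and path), RMAN can catalog its data files as database copies and use them for fast recovery:

RMAN> CATALOG START WITH '+DATA2/SLOB/DATAFILE/' NOPROMPT;
RMAN> SWITCH DATABASE TO COPY;
RMAN> RECOVER DATABASE;

Switching to the cataloged copy avoids a physical restore; RMAN simply rolls the snapshot image forward with the available logs.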



Storage snapshots host user

Typically, DBAs execute Oracle SQL and RMAN commands, and storage admins execute storage management operations (such as SnapVX or SRDF commands). This type of role management and security segregation is common in large organizations, where each group manages their respective infrastructure with a high level of expertise.

There are reasons to merge these roles to some extent, for example, to allow a database backup operator controlled access to both Oracle SQL and SnapVX commands so they can create their own backups leveraging storage snapshots. Use VMAX Access Controls (ACLs) to allow the backup manager limited control of a defined set of devices and operations, tied to a specific backup host.

It is beyond the scope of this paper to discuss the configuration and usage of VMAX ACLs; however, it is important to mention that Solutions Enabler can be installed for a non-root user and, together with ACLs, allows the storage admins to offload such backup operations to the backup admin.

Snapshot time and clock differences

When performing media recovery, Oracle looks for either the end-of-hot-backup-mode marker in the archive logs, or for the user to supply the 'snapshot time' during the media recovery, which is the time the snapshot of the data files was created.

View the snapshot time by listing the snapshots, as shown below. Keep in mind, however, that the storage management software shows times adjusted to its own clock and time zone. If it exactly matches the production database server's clock (for example, when using NTP), then the listed times can be used as the 'snapshot time'. Alternatively, during the backup, include the database server time in the snapshot name.

# symsnapvx -sg database_sg list

Storage Group (SG) Name  : database_sg
SG's Symmetrix ID        : 000197700048 (Microcode Version: 5977)

----------------------------------------------------------------------------
Sym                                    Num  Flags
Dev   Snapshot Name                    Gens FLRG TS Last Snapshot Timestamp
----- -------------------------------- ---- ------- ------------------------
00067 database_20171025-160003            1 .X.. .. Wed Oct 25 16:00:03 2017
      database_20171025-095033            1 .... .. Wed Oct 25 09:50:33 2017
      database_20171024-155406            1 .... .. Tue Oct 24 15:54:04 2017
00068 database_20171025-160003            1 .X.. .. Wed Oct 25 16:00:03 2017
      database_20171025-095033            1 .... .. Wed Oct 25 09:50:33 2017
      database_20171024-155406            1 .... .. Tue Oct 24 15:54:04 2017
...
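Snapshot names like those listed above can embed the database server time at creation, for example (a minimal sketch, assuming the command runs with the database server's clock):

# symsnapvx -sg database_sg -name database_$(date +%Y%m%d-%H%M%S) establish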

During the media recovery, Oracle inspects the data file headers (list them using the following query) and compares the last checkpoint time to the snapshot time.

SQL> select name, checkpoint_change#, to_char(checkpoint_time, 'DD.MM.YYYY HH24:MI:SS') checkpoint_time from v$datafile_header;

NAME                                          CHECKPOINT_CHANGE# CHECKPOINT_TIME
--------------------------------------------- ------------------ -------------------
+DATA/SLOB/DATAFILE/system.257.953030737                27380508 25.10.2017 15:41:25
+DATA/SLOB/DATAFILE/sysaux.258.953030739                27380508 25.10.2017 15:41:25
+DATA/SLOB/DATAFILE/sys_undots.259.953030755            27380508 25.10.2017 15:41:25
+DATA/undotbs1.dbf                                      27380508 25.10.2017 15:41:25
+DATA/undotbs2.dbf                                      27380508 25.10.2017 15:41:25
+DATA/SLOB/DATAFILE/slob.263.953031317                  27380508 25.10.2017 15:41:25

Oracle expects that the last checkpoint took place prior to the snapshot time. If the checkpoint time is later than the snapshot time, Oracle produces the following error:

ORA-19839: snapshot datafile checkpoint time is greater than snapshot time

To avoid this situation, make sure that the snapshot time is accurate and fits the clock and time zone of the database server from which the snapshot was created.

Tests show that in some cases (unrelated to storage snapshots), the file headers' checkpoint time is a minute or two ahead of the actual database server clock (or 'sysdate'). While this appears to be an Oracle bug (a case was opened), and the chance of a database checkpoint occurring just before a snapshot is slim, it is described here in case it occurs at a customer site.

If you receive the ORA-19839 message, and the file headers' checkpoint_time is a minute or two ahead of the snapshot time, use the checkpoint_time from the file headers as the 'snapshot time' in the media recovery. It only means that Oracle will require slightly more recovery before the database can be opened.
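For reference, this is a hedged sketch of supplying the snapshot time during media recovery of a mounted database (the timestamp is illustrative, and the accepted date format depends on the session's NLS settings; see Oracle support note 604683.1 for the full procedure):

RMAN> RECOVER DATABASE SNAPSHOT TIME '2017-10-25 16:00:03';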

Oracle ASM considerations

Number of ASM disk groups

The storage design principles for Oracle on VMAX All Flash are documented in the white paper Dell EMC VMAX All Flash storage for mission-critical Oracle databases. Below are a few key points:

Define at least three ASM disk groups (and matching VMAX storage groups) for maximum flexibility: +DATA (data and control files), +REDO (redo logs), and +FRA (archive logs). A parent storage group that includes both +DATA and +REDO is recommended, and is used for creating restartable replicas (see the sketch after this list).

The separation of data, redo, and archive log files allows backup and restore of only the appropriate file types at the appropriate time. For example, Oracle backup procedures require the archive logs to be replicated at a later time than the data files. Also, during restore, if the redo logs are still available on the production host, we can restore only the data files without overwriting production's redo logs.
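A hedged sketch of creating such a parent (cascaded) storage group with Solutions Enabler, assuming child storage groups data_sg and redo_sg already contain the +DATA and +REDO devices (all names are illustrative):

# symsg -sid 048 create database_sg
# symsg -sid 048 -sg database_sg add sg data_sg,redo_sg

A single consistent, restartable snapshot can then be taken against the parent:
# symsnapvx -sg database_sg -name database_snap establish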

ASM, ASMlib, and ASM Filter Driver

Another aspect of ASM is that it can be used without other drivers, pointing directly to the storage devices. It can utilize ASMlib, an optional driver that places labels on the storage devices, which ASM then uses when creating the disk groups; or it can use ASM Filter Driver (AFD), which also provides its own device labels (and other functionality).

From the perspective of VMAX storage replications (that is, SRDF and SnapVX), it does not matter whether ASM, ASMlib, or AFD is used. What is important is to be consistent. For example, if AFD is used on production, it should also be used on the mount host.

The examples in this paper use AFD. If ASMlib is used instead, additional operations are required on the mount host that tie into ASMlib specifically, for example, an ASMlib disk scan, and an ASMlib label rename when necessary.

Oracle RAC and +GRID ASM disk group

When Oracle RAC is used, it is recommended to use a separate ASM disk group for the Grid Infrastructure (GI), for example, +GRID. The +GRID ASM disk group should not contain user data. For high-performance databases, use normal redundancy (host-based mirroring) only for this ASM disk group, as Oracle creates three quorum files instead of just one if external redundancy is used (no host-based mirroring). All other ASM disk groups should use external redundancy, as VMAX provides efficient RAID protection.

Since +GRID does not contain any user data, and since the GI setup contains host-specific information, do not include +GRID in the replications (SnapVX or SRDF). Instead, pre-install GI on the mount host(s) with its own +GRID. When the replicated ASM disk groups are made visible to the mount host(s), they can simply be mounted into the existing cluster.

If the production database was using RAC, the database can be started on the mount host(s) in either clustered or non-clustered mode. The reason is that Oracle RAC uses shared storage and requires all data to be visible to all nodes; therefore it will be part of the replication, regardless of how the database is started on the target.

If production is not clustered, there will not be a readily available cluster waiting on the mount host to mount the replicated ASM disk groups. Instead, if it is not already running, start the ASM software stack once the replica is made available, using the 'srvctl start has' command, and mount the disk groups as sketched below.
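A minimal sketch of bringing the replica up on a non-clustered mount host (the disk group names follow this paper's examples):

Start the Oracle Restart stack if it is not already running:
$ srvctl start has

Mount the replicated disk groups from an ASM (sysasm) session:
SQL> alter diskgroup DATA mount;
SQL> alter diskgroup REDO mount;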

Flashback logs and storage replications

Flashback Database is an optional Oracle feature that returns the database to a point in time in the past. It requires that the database has no physical corruptions. It also requires all the flashback and archive logs from the present time back to the flashback time. Flashback information relies on the production database's control file; therefore, if the control file is recovered or recreated, the feature cannot be used.

Oracle uses the Fast Recovery Area (FRA) as the location of the flashback logs. In this paper we created an ASM disk group called +FRA for the Fast Recovery Area, and we assume that the archive logs are sent there. While the archive log destination can be any ASM disk group (defaulting to the FRA), flashback logs always go to the FRA. Typically, a very large capacity is required for flashback logs, even with a relatively small retention time.

In relation to storage replications, Flashback Database can only be used with a restartable solution, since a recoverable solution assumes a backup control file. If the Flashback Database feature is enabled, and the DBA wants to be able to use it on the mount host, then the previous requirement for a restartable replica of all data, redo, and control files has to be extended to also include the flashback logs in the same storage replica (SnapVX or SRDF). This makes sure that the latest past images of database blocks are consistent with the data files. That means that restartable snapshots have to include +DATA, +REDO, and +FRA (not only +DATA and +REDO).

In that case, the DBA may want to consider separating the archive logs' ASM disk group from the flashback logs'. For example, create a disk group called +ARCH and send the archive logs there, while flashback logs go to +FRA. By doing so, the recoverable use cases described in this paper remain possible, as they require a snapshot of the archive logs to occur after a snapshot of the data files.

Remote replications specific considerations

SRDF and consistency

As mentioned in the SRDF product overview, it is recommended that an SRDF/A solution always has consistency enabled. This ensures that if a single device cannot replicate, the entire SRDF group stops replicating, maintaining a consistent database replica on the target devices. To enable SRDF/A consistency using an SG, use the command:

symrdf -sid <SID> -rdfg <RDF group> -sg <SG> enable

NOTE: When more than one SRDF group participates in the replications, a CG has to be created and used to enable consistency.

In a similar way, SRDF/S can also benefit from enabling consistency. Unlike SRDF/A, when a single SRDF group is used, SRDF/S does not allow enabling consistency at the SG level and requires a CG instead. For simplicity, the SRDF/S examples in this paper manage replications using an SG; however, it is a best practice to enable consistency for both SRDF/S and SRDF/A. In the SRDF/S case, this means using a CG to manage the replications, even if a single SRDF group is used, as the sketch below shows.
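A hedged sketch of managing SRDF/S consistency through a consistency group (the group name is illustrative, and exact symcg syntax may vary by Solutions Enabler version):

Create an RDF1 consistency group and add the devices from the storage group:
# symcg create database_cg -type rdf1 -rdf_consistency
# symcg -cg database_cg -sid 048 addall dev -sg database_sg

Enable consistency protection for the group:
# symcg -cg database_cg enable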

SRDF replications and ASM Fast Recovery Area (FRA)

Natively, SRDF is a restart solution. Since database crash recovery never uses archive logs, there is no need to include a +FRA ASM disk group (the archive logs' default location) in the SRDF replications. However, there are a few reasons why +FRA could be included:

If Oracle backup is offloaded to the target storage array, the archive logs are needed there. In this case, the archive logs can use the same or a different SRDF group. However, the replication mode (Sync or Async) should match that of the data files. That means that if database_sg is replicated in Sync mode, fra_sg should also be replicated in Sync mode, so that regardless of where the database is started (local or remote), all the appropriate archive logs are available.

If the DBA wants a +FRA ASM disk group at the target site that is identical to the production database's, this can be accommodated. While +FRA can be created separately on the target array (saving replication bandwidth), the DBA may prefer to prevent any differences by simply replicating the production database's +FRA.

As discussed earlier, if Oracle Flashback Database will be used at the remote site, then both flashback and archive logs need to be replicated together with the other database files. The DBA can still decide to keep the archive logs in another ASM disk group to allow for remote backups.

Remote replications specific considerations

SRDF replications and Oracle temp files

Like archive logs, temp files are not required for either a recoverable or a restartable replication solution. As such, if they are separated into their own ASM disk group, that disk group does not have to be included in the replications, saving bandwidth. In most cases, however, temp files share ASM disk groups with other database files, in which case they are replicated together with them.

Another reason to replicate the temp files is that although they don't contain user data or participate in a recovery or restart solution, Oracle will look for them when it attempts to open the database at the target site. So that database operations are not delayed, it is best to include them in the replications together with database_sg.

SRDF replications of multiple databases, message queues, and external files

Generally, a database does not operate in a silo. It has relationships to other databases (two-phase commit, or loosely coupled relationships often created by application code), message queues to other databases, external files (for example, images and other media or unstructured content), and more.

One of SRDF's strengths is its ability to create a consistency group across a set of such databases, external files, and message queues, as long as they all reside on VMAX storage devices. This is very powerful because in a true disaster, not all systems crash at exactly the same time. As a result, solutions that can't maintain consistency across databases may spend a huge amount of time after the disaster reconciling dependencies, owed transactions and their order, and message queues between databases before the databases can be accessed by users.

SRDF consistency groups can include all such related databases, applications, message

queues, and external files, making all these related components consistent with each

other, so after a disaster, simple restart operations take place and user operations resume

quickly.

SRDF replications of Oracle Home

Normally, Oracle Home contains the configuration files, binaries, and database logs that

are applicable to the servers they are installed on. For that reason, Oracle Home is not

often included in the replications. However, some DBAs may prefer to include the Oracle

Home (and perhaps even Grid Infrastructure Home) due to the many patches they may

have applied and the desire to have those patches available at the target site if they need

to move operations there.

In that case, Oracle Home should be installed on a VMAX device and should be included in the replications. It can have its own SRDF group or share one with the data files.

SRDF ‘SyncInProg’ and remote gold copy

It is always recommended to plan for a database gold copy at the SRDF target site as a

safety measure for rolling disasters.

For example, if SRDF replication was interrupted (planned or unplanned) and changes accumulated on the source array, then once synchronization resumes, and until the target


array reaches a synchronized state (SRDF/S) or a consistent state (SRDF/A), the target database image is not usable. For that reason, it is a best practice to take a gold copy snapshot at the target site before such a resynchronization starts. This gold copy preserves the last valid remote image of the database as a safety measure until the target is in sync again.
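As a sketch, assuming the SRDF target devices on the remote array (SID 047) are managed by a storage group named database_tgt_sg (a hypothetical name), the sequence might look like this:

# On the remote array, preserve the last valid target image before resuming
symsnapvx -sid 047 -sg database_tgt_sg -name gold_copy establish

# Resume SRDF replication from the source side
symrdf -sid 048 -rdfg <RDF group> -sg database_sg resume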

Do SnapVX or SRDF ‘replicate corruptions’?

Sometimes, in order to push a log-shipping agenda, a vendor will claim that storage replications such as SRDF or SnapVX replicate corruptions from the production database to the replication target, whereas their log-shipping solution does not. It is good to understand what is behind such a claim and assess its truthfulness.

From a storage replication perspective, both SnapVX and SRDF replicate the source data to the target accurately. As mentioned in the section VMAX and T10-DIF protection from silent corruptions, VMAX uses T10-DIF. That means the I/O is only vulnerable to corruption while it travels from the source database host to the storage array. After that, it is protected by VMAX, including during replications. In other words, only one vulnerability path exists. If external T10-DIF is added, even that vulnerability path is eliminated.

A log shipping solution has two active databases: the source and the standby (target). Although the log records shipped to the standby are validated before they are applied, once they are applied, the database changes go through the I/O path just as they do on the source database. In other words, a log shipping solution has two vulnerability paths where silent corruptions may occur. Of course, if VMAX is used for both, external T10-DIF can be enabled for both.

Therefore, we can say that without external T10-DIF (which makes both solutions resilient), a log shipping solution has twice the vulnerability of VMAX replications. The slight difference is that a VMAX replica will be identical to the source (including pre-existing corruptions, if there are any), whereas a log shipping replica can introduce new corruptions, due to the I/O exposure while the standby writes to its data files.

Silent data corruptions are not discovered until the data is read, and that can be a long time after it was written. In a rolling-disaster case, corruptions are first introduced to the replication target (storage replication or log shipping replication), and then the production database is lost. To avoid this, with either replication technology it is a good practice to check for database corruptions periodically. With storage replications, either the source or the target database can be validated, as described in Database integrity validation on a mount host. In a log shipping solution, both the production and standby databases should be tested (as different silent corruptions may exist in each).

To summarize, VMAX replications can actually be considered safer than log shipping

replications. The choice of storage replications or log shipping should be driven by

business needs. Very often, both types are used in parallel. Remember that a big

advantage for SRDF is its ability to create a consistency group across a group of related

databases and applications, including external files and message queues. The ability to

perform restart operations after a disaster where everything is consistent, instead of

reconciling between databases, saves time and reduces complexity.
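For such periodic checks, RMAN validation can be run on production or, with storage replications, offloaded to a mount host (see Database integrity validation on a mount host). A minimal example of such a check:

RMAN> backup validate check logical database;

This reads all data blocks and reports physical and logical corruptions without producing backup output.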


3-site SRDF replications

Besides the obvious 2-site topologies, SRDF can also operate in cascaded or STAR mode, allowing consistent replications among three locations. From an Oracle perspective, the best practices for restart or recovery don't change. While a discussion of 3-site solutions is beyond the scope of this paper, the same best practices discussed here apply to both 2-site and 3-site SRDF deployments.

Lab configuration

The following tables show the environment used to test and demonstrate the use cases

described in the following chapters. Table 1 shows the VMAX storage environment, Table

2 shows the host environment, and Table 3 shows the Oracle ASM and VMAX storage

groups configuration.

Table 1. Storage environment

Configuration aspect          Description
Storage array (local)         VMAX 950F (048), single V-Brick
Storage array (remote)        VMAX 950F (047), single V-Brick
HYPERMAX OS                   5977.1125
Flash drives in each array    32 x SSD, RAID5 (7+1)

Table 2. Host environment

Configuration aspect    Description
Oracle                  Oracle Grid Infrastructure and Oracle Database release 12.2.0.1 (see the note after Table 3)
Production hosts        2 x Dell EMC PowerEdge R730, 28 cores, 128 GB RAM, Red Hat Enterprise Linux 7.2
Mount hosts             2 x Cisco UCS C240 M3, 20 cores, 96 GB RAM, Oracle Enterprise Linux 7.1
Multipathing            PowerPath 6.1
Volume Manager          Oracle ASM 12.2.0.1

Table 3. Oracle ASM and VMAX storage groups configuration

Database: Production is a 2-node RAC, release 12.2.0.1; Mount is a 2-node RAC, release 12.2.0.1; DB name: slob; DB size: 1.2 TB (+DATA & +REDO).

ASM disk group    Size           Storage Group (Prod)     Storage Group (Mount)
+GRID             3 x 40 GB      grid_sg                  grid_mount_sg
+DATA             16 x 100 GB    data_sg (child)          data_mount_sg (child)
+REDO             8 x 50 GB      redo_sg (child)          redo_mount_sg (child)
+DATA & +REDO                    database_sg (parent)     database_mount_sg (parent)
+FRA              1 x 250 GB     fra_sg                   fra_mount_sg

NOTE: Any Oracle 12c feature or best practice in this paper is applicable to both Oracle Database release 12.1 and release 12.2. Hot-backup mode based solutions fit older Oracle releases as well.

Figure 3 shows the overall test configuration used for the local replications use cases.

Figure 3. Oracle local replications test environment

A 2-node Oracle RAC 12.2 database ran on the local array, VMAX 950F (SID 048). ASM was configured with a +GRID disk group for the grid infrastructure, with normal redundancy and no user data. As such, it was not part of the replications. The other ASM disk groups (+DATA, +REDO, +FRA) used external redundancy and matched the VMAX storage groups (data_sg, redo_sg, and fra_sg). A parent storage group, database_sg, contained both data_sg and redo_sg.

The +GRID ASM disk group was pre-configured on the VMAX devices of the mount hosts, and was not based on the production snapshots. Once the snapshot target devices were made available to the mount host, their ASM disk groups were simply mounted to the pre-configured cluster.

Figure 4 shows the overall test configuration used for the remote replications test cases.


Figure 4. Oracle remote replications test environment

A 2-node Oracle RAC 12.2 database was running on the local array, VMAX 950F (SID 048). ASM was configured with a +GRID disk group for Grid Infrastructure (GI), with normal redundancy and no user data. As such, it was not part of the replications. The other ASM disk groups (+DATA, +REDO, +FRA) used external redundancy and matched the VMAX storage groups (data_sg, redo_sg, and fra_sg). A parent storage group, database_sg, contained both data_sg and redo_sg.

The remote array, VMAX 950F (SID 047) was configured with its own +GRID ASM disk

group. Once the SRDF target devices or the remote snapshot target devices were made

available, ASM was simply able to mount these disk groups and use them.

There are two storage management hosts: local and remote. While the SRDF links are connected, each storage management host can issue commands to either the local or the remote array. However, it is best to have a storage management host prepared at each site in case a disaster occurs and the links between the arrays are not operational.

NOTE: To make use of Solutions Enabler CLI, a storage management host (or vApp) is required.

However, if only Unisphere or REST APIs are used then the VMAX embedded management

module can be used.

Two Linux aliases are used in the examples to change the Oracle user environment

variables between the database and ASM.

‘TODB’ is a Linux alias that sets the Oracle user environment variables of

ORACLE_HOME and ORACLE_SID to those of the database.

alias TODB='export ORACLE_BASE=$DB_BASE; export ORACLE_HOME=$DB_HOME; export ORACLE_SID=$DB_SID; export PATH=$BASE_PATH:$DB_HOME/bin'



‘TOGRID’ is a Linux alias that sets the Oracle user environment variables of

ORACLE_HOME and ORACLE_SID to those of the Grid Infrastructure (ASM).

alias TOGRID='export ORACLE_BASE=$GRID_BASE; export ORACLE_HOME=$GRID_HOME; export ORACLE_SID=$GRID_SID; export PATH=$BASE_PATH:$GRID_HOME/bin'

We used TOGRID or TODB aliases prior to executing commands associated with ASM or

the database.

VMAX device identification on a database server

Sometimes it is necessary to match the VMAX device IDs of a storage group (SG) to the device presentation on the production or mount database servers; for example, when creating text files with device names in order to perform an ASM disk group rename.

In the following example, we match the data_mount_sg VMAX device IDs with the devices on the database host.

First, identify the storage device IDs that are part of data_mount_sg. To find the device

IDs of the storage group, use the Unisphere interface, or use the following command from

the storage management host:

# symsg show data_mount_sg

...

Devices (24):

{

----------------------------------------------------------------

Sym Device Cap

Dev Pdev Name Config Attr Sts (MB)

----------------------------------------------------------------

000DA N/A TDEV RW 102401

000DB N/A TDEV RW 102401

000DC N/A TDEV RW 102401

...

To match storage and host devices you can use the scsi_id command, a PowerPath

command, or an inq (inquiry) command, as explained in these sections.

Using scsi_id command

The scsi_id Linux command is provided by udev (part of the systemd package on recent Red Hat Enterprise Linux releases) and is typically available by default. In the following listing, the three digits of the storage serial ID (for example, 048) are followed by the device ID (for example, 066, 075, and so on). The command can be adjusted to /dev/mapper/*p1 if you are using native multipathing.

[root@dsib0144 download]# for i in `ls -1 /dev/emcpower*1`; do echo $i; scsi_id --page 0x80 --whitelisted --device=$i; done
/dev/emcpowera1
SEMC SYMMETRIX 700048066000
/dev/emcpoweraa1



SEMC SYMMETRIX 700048075000

/dev/emcpowerab1

SEMC SYMMETRIX 700048074000

/dev/emcpowerac1

...
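If native multipathing is used instead of PowerPath, the same loop can be pointed at the multipath devices (a sketch, assuming partition names ending in p1):

# for i in /dev/mapper/*p1; do echo $i; scsi_id --page 0x80 --whitelisted --device=$i; done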

Using PowerPath commands

PowerPath commands can be executed on the database server to list the host device presentation for each storage device, as shown below:

[root@dsib0144 ~]# powermt display dev=all

Pseudo name=emcpowera

Symmetrix ID=000197700048

Logical device ID=00066

Device WWN=60000970000197700048533030303636

state=alive; policy=SymmOpt; queued-IOs=0

...

Pseudo name=emcpoweraa

Symmetrix ID=000197700048

Logical device ID=00075

Device WWN=60000970000197700048533030303735

...

Using Inq command

Another option is to download the free, stand-alone Inquiry (inq) binary from the Dell EMC ftp website and use it on the database server to list the devices, as shown below:

[root@dsib0144 download]# ./inq.LinuxAMD64 -no_dots -showvol -f_powerpath

...

-----------------------------------------------------------------------------------

DEVICE :VEND :PROD :REV :SER NUM :Volume :CAP(kb)

-----------------------------------------------------------------------------------

/dev/emcpowera :EMC :SYMMETRIX :5977 :4800066000 : 00066: 20972160

/dev/emcpowerb :EMC :SYMMETRIX :5977 :480006f000 : 0006F: 104858880

/dev/emcpowerc :EMC :SYMMETRIX :5977 :4800065000 : 00065: 20972160

/dev/emcpowerd :EMC :SYMMETRIX :5977 :480006e000 : 0006E: 104858880

/dev/emcpowere :EMC :SYMMETRIX :5977 :4800064000 : 00064: 20972160

/dev/emcpowerf :EMC :SYMMETRIX :5977 :480006d000 : 0006D: 104858880

/dev/emcpowerg :EMC :SYMMETRIX :5977 :4800082000 : 00082: 157286400

/dev/emcpowerh :EMC :SYMMETRIX :5977 :480006c000 : 0006C: 104858880

/dev/emcpoweri :EMC :SYMMETRIX :5977 :4800081000 : 00081: 157286400

...



Chapter 4 Restartable Database Snapshots

This chapter presents the following topics:

Restartable database snapshots overview and requirements
Creating restartable database snapshot
Mounting restartable snapshot
Refreshing restartable snapshot
Mounting restartable snapshot with a new DBID and file location
Restoring restartable snapshot


Restartable database snapshots overview and requirements

Key reasons for creating restartable snapshots:

1. Production 'gold' copies: Restartable snapshots are created or restored in seconds, regardless of database size. Restartable snapshots don't consume additional storage capacity upon creation, and don't require complex database recovery. For these reasons, such snapshots can be taken as "gold copies" prior to applying database patches, server upgrades, or no-logging batch loads. If something happens to the database, the snapshot can be restored in seconds, returning the database to its state prior to the operation. (As soon as a snapshot is restored, its data is available to the source devices, even as background copy of the changed data takes place. If a host requests data that hasn't been copied yet, that data is prioritized.)

2. Creating new database copies: Restartable snapshots provide a quick and easy way to create new instances of the production database for purposes such as test, development, or reporting. The snapshot database starts as an identical copy of the production database's data from the time of the snapshot, and only consumes additional storage for changes made to the data on the source or linked target devices. Sensitive data can be masked in the snapshot database before exposing it to users, or the snapshot can be a source for additional snapshots, such as when creating multiple test/development copies. The original or subsequent snapshots can be refreshed together or separately.

NOTE: Snapshots are protected. That means that a snapshot of the production database can be restored over and over during a patch update, even if the update fails multiple times. It also means that if the snapshot is used to create a new copy of the database, and that database copy is modified (for example, sensitive data is masked), the snapshot's original data remains intact and can be used to create more copies of the original database. If a copy of the masked database is desired, a new snapshot of the target devices holding the masked database can be created, and that snapshot can be used as a source for other database copies.
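For example, such a second-generation snapshot of the masked copy might look like the following sketch (the snapshot name is arbitrary):

# Snapshot the linked target devices that hold the masked database copy
symsnapvx -sg database_mount_sg -name masked_db_snap establish

This snapshot can then be linked to additional target SGs to provision more masked copies.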

To satisfy the requirement for a restartable snapshot, it has to include all redo logs, control files, and data files, and it has to be taken in a single storage-consistent snapshot operation. Note that native SnapVX snapshots are always consistent, even if the snapshot includes devices spread across multiple arrays.

Other files, such as temp files or archive logs, are not required. Consider including them if the snapshot purpose is to create a database copy, which may require its own temp files and/or archive logs, or if they are mixed with the other files and will be included in the snapshot anyway (for example, temp files sharing devices with data files).

The following steps show how to create a valid restartable database snapshot.

1. Identify the storage group that contains all the data files, control files, and redo logs. Ideally, data files and redo logs each have their own ASM disk groups and matching storage groups (for example, data_sg for the +DATA ASM disk group, and redo_sg for the +REDO ASM disk group). In that case, a parent SG such as database_sg may already exist or can be added, containing data_sg and redo_sg. This parent SG is used to create restartable database snapshots. Note that it does not include the archive logs, or the +GRID devices in the case of RAC.

2. Without any need to condition the production database (that is, do not use hot-backup mode), simply establish (create) the snapshot. The command requires the storage group name and a user-provided snapshot name. Additional parameters, such as automatic expiration, can be added but are not used in this example.

symsnapvx -sg <SG_NAME> -name <SNAPSHOT_NAME> establish
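As a sketch of the automatic expiration option, a time-to-live can be set when the snapshot is established (verify the exact -ttl flags against your Solutions Enabler release; the two-day value is arbitrary):

symsnapvx -sg <SG_NAME> -name <SNAPSHOT_NAME> establish -ttl -delta 2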

Creating restartable database snapshot

This section explains how to create a restartable database snapshot using CLI.

Alternatively, you can use Unisphere for VMAX or REST APIs.

1. To demonstrate what data is preserved in the different scenarios, we used a test

table.

SQL> create table testTbl (Id int, Step varchar(255));

2. To simulate ongoing database activity, we started an SLOB OLTP workload in the background.

3. Insert a known record into the test table before taking the snapshot.

SQL> insert into testTbl values (1, 'Before snapshots taken');

SQL> commit;

4. Create a restartable snapshot.

# symsnapvx -sg database_sg -name database_snap establish

5. Insert another known record after the first snapshot.

SQL> insert into testTbl values (2, 'After first snapshot taken');

SQL> commit;

6. Optionally, create another snapshot.

NOTE: When the same storage group (SG) and snapshot name are used to create additional

snapshots, a new snapshot generation is created, where generation 0 always points to the latest

snapshot. When snapshots are listed, the date/time information of each generation is shown.

# symsnapvx -sg database_sg -name database_snap establish

7. Insert the last known record for this test.

SQL> insert into testTbl values (3, 'After second snapshot taken');

8. Inspect the snapshots created using any preferred level of detail.

symsnapvx -sg database_sg -snapshot_name database_snap list

symsnapvx -sg database_sg -snapshot_name database_snap list -gb -detail

symsnapvx -sg database_sg -snapshot_name database_snap list -gb -summary


Mounting restartable snapshot

The snapshot itself can never be used directly. In order to make its data available to other

devices, the snapshot has to be linked to a set of devices matching in size to the original

devices. We call them the linked target devices, target devices or target SG (since SGs

are used to manage groups of devices).

For this purpose, we created a set of matching target devices and placed them in SGs similar to production's SGs, with the word 'mount' in the SG name (since we present them to another host, which we refer to as the 'mount' host). For the following examples, we created redo_mount_sg, data_mount_sg, and a parent SG containing both, called database_mount_sg.

It is important to consider zoning and LUN masking, which are the operations of making

devices visible to hosts. In this example, the mount host is pre-zoned to the storage array,

and the target devices are placed in a masking view and made visible to the mount host,

even before the snapshot is linked.

Remember that if the devices are presented for the first time to the mount host, their partitions only become visible once the snapshot is linked to the target devices. This is no longer a consideration when the snapshot is refreshed (relinked), as by then the partitions are already known to the mount host, with the correct permissions.

When deciding which snapshot generation to use, it is best to first list the snapshot

generations (use Unisphere, or the ‘-detail’ option in the ‘symsnapvx list’ command) and

choose the appropriate snapshot. When linking a snapshot to target devices, if a

generation number isn’t used, gen 0 (the latest snapshot) is assumed. The CLI command

to link a snapshot to target device is shown below:

symsnapvx -sg <SG_NAME> -lnsg <TARGET_SG_NAME> -snapshot_name <SNAPSHOT_NAME> [-generation <number>] link

If the target SG already has a snapshot linked to it, there is no need to ‘unlink’ it prior to

refreshing the target SG with another snapshot. Simply relink the new snapshot with the

same command as above, using the ‘relink’ option instead of ‘link’.

This procedure shows how to link a snapshot to target devices using CLI. Then, start the

target database and inspect the data.

1. Choose a snapshot to link. By listing the snapshots with -detail flag, each

generation and its date/time is shown.

# symsnapvx -sg database_sg -snapshot_name database_snap list -gb -detail

Storage Group (SG) Name : database_sg

SG's Symmetrix ID : 000197700048 (Microcode Version: 5977)

-------------------------------------------------------------------------------------------------------------

Total

Sym Flags Deltas Non-Shared



Dev Snapshot Name Gen FLRG TS Snapshot Timestamp (GBs) (GBs) Expiration

Date

----- -------------------------------- ---- ------- ------------------------ ---------- ---------- ----------

00067 database_snap 0 .... .. Sun Sep 10 18:20:29 2017 59.6 46.0 NA

database_snap 1 .... .. Sun Sep 10 18:07:26 2017 77.8 65.1 NA

00068 database_snap 0 .... .. Sun Sep 10 18:20:29 2017 60.2 45.1 NA

database_snap 1 .... .. Sun Sep 10 18:07:26 2017 77.9 65.2 NA

Flags:

(F)ailed : X = Failed, . = No Failure

(L)ink : X = Link Exists, . = No Link Exists

(R)estore : X = Restore Active, . = No Restore Active

(G)CM : X = GCM, . = Non-GCM

(T)ype : Z = zDP snapshot, . = normal snapshot

(S)ecured : X = Secured, . = Not Secured

2. Link the snapshot to the target devices. We use generation 1, which is the first database snapshot we took in the previous section.

# symsnapvx -sg database_sg -lnsg database_mount_sg -snapshot_name database_snap link -gen 1

3. The target host should already be zoned and masked to the target devices. If this is the first time a snapshot is made visible to the target host, reboot it or rescan the SCSI bus to make sure the devices and their partitions are recognized by the mount host. Give the partitions (if used) or the devices (otherwise) Oracle permissions.
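For example, an online SCSI bus rescan on Linux might look like the following sketch (sysfs paths vary by HBA driver, and the ownership settings are assumptions that should match your environment's Oracle user and group):

# Rescan all SCSI hosts for newly visible devices
for h in /sys/class/scsi_host/host*; do echo "- - -" > $h/scan; done
# Example only: grant Oracle permissions to the PowerPath partitions
# chown oracle:asmadmin /dev/emcpower*1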

4. Log in to the ASM instance on the target host. The ASM disk groups on the target

devices should be in the unmounted state. Mount them using ‘asmcmd’ or SQL,

such as the following example.

[oracle@dsib0057 ~]$ TOGRID

[oracle@dsib0057 ~]$ sqlplus "/ as sysasm"

SQL*Plus: Release 12.2.0.1.0 Production on Sun Sep 10 10:35:17 2017

Copyright (c) 1982, 2016, Oracle. All rights reserved.

Connected to:

Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit

Production

SQL> select name, state from v$asm_diskgroup;

NAME STATE

------------------------------ -----------


REDO DISMOUNTED

DATA DISMOUNTED

GRID MOUNTED

SQL> alter diskgroup data mount;

Diskgroup altered.

SQL> alter diskgroup redo mount;

Diskgroup altered.

SQL> select name, state from v$asm_diskgroup;

NAME STATE

------------------------------ -----------

GRID MOUNTED

DATA MOUNTED

REDO MOUNTED

5. Log in to the database instance on the mount host, and simply start the database.

Do not perform any media recovery. During this step Oracle performs crash or

instance recovery.

[oracle@dsib0057 ~]$ TODB

[oracle@dsib0057 ~]$ sqlplus "/ as sysdba"

SQL*Plus: Release 12.2.0.1.0 Production on Sun Sep 10 10:52:41 2017

Copyright (c) 1982, 2016, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup

ORACLE instance started.

Total System Global Area 1.5334E+10 bytes

Fixed Size 19255200 bytes

Variable Size 4429185120 bytes

Database Buffers 1.0737E+10 bytes

Redo Buffers 148516864 bytes

Database mounted.

Database opened.

Optional: If archive log mode is not necessary (or +FRA is not available) on the

mount host, the following example shows how to disable archiving before opening

the database.

[oracle@dsib0057 ~]$ TODB

[oracle@dsib0057 ~]$ sqlplus "/ as sysdba"

SQL> startup mount;


SQL> alter database noarchivelog;

SQL> alter database open;

6. Inspect the data in the test table. Since we used generation 1, which was the first

snapshot, the data shows the table’s record from that time.

SQL> select * from testTbl;

ID STEP

---------- --------------------------------------------------

1 Before snapshots taken

Refreshing restartable snapshot

This section explains how to refresh the mount host with another snapshot.

1. Before linking a different snapshot to the target SG, bring down the database and dismount the ASM disk groups on the mount host, as the target devices' data is about to be changed.

NOTE: If the target database is RAC, make sure to shut down all the instances and dismount the relevant ASM disk groups on all nodes.

a. Shutdown Oracle instance:

[oracle@dsib0057 ~]$ TODB

[oracle@dsib0057 ~]$ sqlplus "/ as sysdba"

SQL*Plus: Release 12.2.0.1.0 Production on Mon Sep 11 10:17:50 2017

Copyright (c) 1982, 2016, Oracle. All rights reserved.

Connected to:

Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit

Production

SQL> shutdown immediate;

Database closed.

Database dismounted.

ORACLE instance shut down.

b. Dismount ASM disk groups:

[oracle@dsib0057 ~]$ TOGRID

[oracle@dsib0057 ~]$ sqlplus "/ as sysasm"

SQL> alter diskgroup data dismount;

SQL> alter diskgroup redo dismount;


2. Choose a snapshot to link. By listing the snapshots with -detail flag, each

generation of a specific snapshot_name and its date/time is shown.

# symsnapvx -sg database_sg -snapshot_name database_snap list -gb -detail

Storage Group (SG) Name : database_sg

SG's Symmetrix ID : 000197700048 (Microcode Version: 5977)

-------------------------------------------------------------------------------------------------------------

Total

Sym Flags Deltas Non-Shared

Dev Snapshot Name Gen FLRG TS Snapshot Timestamp (GBs) (GBs) Expiration

Date

----- -------------------------------- ---- ------- ------------------------ ---------- ---------- ----------

00067 database_snap 0 .... .. Sun Sep 10 18:20:29 2017 59.6 46.0 NA

database_snap 1 .... .. Sun Sep 10 18:07:26 2017 77.8 65.1 NA

00068 database_snap 0 .... .. Sun Sep 10 18:20:29 2017 60.2 45.1 NA

database_snap 1 .... .. Sun Sep 10 18:07:26 2017 77.9 65.2 NA

...

3. Link the snapshot to the target devices. This time we use generation 0, which is the most recent snapshot (the second snapshot we took in the previous example).

NOTE: There is no need to terminate the previous snapshot first; use the 'relink' option. There is also no need to specify '-gen 0' because it is the default.

# symsnapvx -sg database_sg -lnsg database_mount_sg -snapshot_name database_snap relink

NOTE: As before, the target host should already be zoned and masked to the target devices.

4. Log in to the ASM instance on the target host. The ASM disk groups on the target

devices should be visible, though in an unmounted state. Mount them as follows.

[oracle@dsib0057 ~]$ TOGRID

[oracle@dsib0057 ~]$ sqlplus "/ as sysasm"

SQL> select name, state from v$asm_diskgroup;

NAME STATE

------------------------------ -----------

REDO DISMOUNTED

DATA DISMOUNTED

GRID MOUNTED

SQL> alter diskgroup data mount;

Diskgroup altered.

SQL> alter diskgroup redo mount;

Diskgroup altered.


SQL> select name, state from v$asm_diskgroup;

NAME STATE

------------------------------ -----------

GRID MOUNTED

DATA MOUNTED

REDO MOUNTED

5. Log in to the database instance on the mount host. Start the database. Do not

perform any media recovery. During this step Oracle performs crash or instance

recovery.

[oracle@dsib0057 ~]$ sqlplus "/ as sysdba"

SQL*Plus: Release 12.2.0.1.0 Production on Sun Sep 10 10:52:41 2017

Copyright (c) 1982, 2016, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup

ORACLE instance started.

Total System Global Area 1.5334E+10 bytes

Fixed Size 19255200 bytes

Variable Size 4429185120 bytes

Database Buffers 1.0737E+10 bytes

Redo Buffers 148516864 bytes

Database mounted.

Database opened.

Alternatively, if archiving was enabled on production but is not needed on the mount host, disable it prior to opening the database on the mount host:

[oracle@dsib0057 ~]$ sqlplus "/ as sysdba"

SQL*Plus: Release 12.2.0.1.0 Production on Sun Sep 10 10:52:41 2017

Copyright (c) 1982, 2016, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup mount;

ORACLE instance started.

Total System Global Area 1.5334E+10 bytes

Fixed Size 19255200 bytes

Variable Size 4429185120 bytes

Database Buffers 1.0737E+10 bytes


Redo Buffers 148516864 bytes

Database mounted.

SQL> alter database noarchivelog;

Database altered.

SQL> alter database open;

Database altered.

6. Inspect the data in the test table. Since we used generation 0, which was the

second snapshot, the data shows the table’s records from that time.

SQL> select * from testTbl;

ID STEP

---------- ----------------------------------------

1 Before snapshots taken

2 After first snapshot taken

Mounting restartable snapshot with a new DBID and file location

This use case is similar to the previous one, except that the database on the mount host will use a different file location (ASM disk group name), instance ID (SID), database name, and DBID. This is a common practice when hosting multiple copies of the production database on a single mount host (for example, test/development copies). Each copy requires a different name and location.

This procedure explains how to link a snapshot to target devices using CLI. Afterwards,

we’ll start the target database using the new file location and inspect the data.

NOTE: The initial steps are identical to the previous use case and therefore are not shown in

detail. Make sure the target database is down and the appropriate ASM disk groups (+DATA and

+REDO) are unmounted.

1. Choose a snapshot to link. By listing the snapshots with the -detail flag, each

generation and its date/time is shown.

2. Link the snapshot to the target devices. In this example, we use generation 0, which is the latest database snapshot we took.

3. The mount host should already be zoned and masked to the target devices. If this

is the first time a snapshot is made visible to the mount host, reboot it or rescan

the SCSI bus online to make sure the devices and their partitions are recognized

by the host. Make sure the partitions (if used) or devices (otherwise) receive

Oracle permissions.

4. Rename the ASM disk groups.



Now that the target devices are visible to the mount host, create the new ASM disk group names. If ASMlib is used, rename the ASMlib labels first, before the disk groups can be renamed.

In this example, we change the ASM disk group names from +DATA to +DATA_ENV_1, and from +REDO to +REDO_ENV_1.

ASM uses a text file to drive the disk group rename; it contains a list of the devices, the old disk group name, and the new name.

Since AFD labels were used when the disk groups were created, the labels can be used in the text file as well. However, as can be seen in the +REDO rename example below, as part of the rename execution ASM changes the text file from the labels back to the actual device names. For that reason, if the text file is going to be used more than once, save a copy of it beforehand.

NOTE: The renamedg command requires the ASM disk_string parameter. It can be listed by using

the command: ‘asmcmd dsget’ or by looking at the ASM init.ora parameter ASM_DISKSTRING.

Rename the ASM disk group +DATA to +DATA_ENV_1:

[oracle@dsib0057 scripts]$ cat ora_asm_rename_data.txt

AFD:DATA0001 DATA DATA_ENV_1

AFD:DATA0002 DATA DATA_ENV_1

AFD:DATA0003 DATA DATA_ENV_1

AFD:DATA0004 DATA DATA_ENV_1

AFD:DATA0005 DATA DATA_ENV_1

AFD:DATA0006 DATA DATA_ENV_1

AFD:DATA0007 DATA DATA_ENV_1

AFD:DATA0008 DATA DATA_ENV_1

AFD:DATA0009 DATA DATA_ENV_1

AFD:DATA0010 DATA DATA_ENV_1

AFD:DATA0011 DATA DATA_ENV_1

AFD:DATA0012 DATA DATA_ENV_1

AFD:DATA0013 DATA DATA_ENV_1

AFD:DATA0014 DATA DATA_ENV_1

AFD:DATA0015 DATA DATA_ENV_1

AFD:DATA0016 DATA DATA_ENV_1

[oracle@dsib0057 scripts]$ renamedg dgname=DATA newdgname=DATA_ENV_1 config=./ora_asm_rename_data.txt asm_diskstring='/dev/emc*1,AFD:*'
Parsing parameters..
renamedg operation: dgname=DATA newdgname=DATA_ENV_1 config=./ora_asm_rename_data.txt asm_diskstring=/dev/emc*1,AFD:*

Executing phase 1

Discovering the group

Checking for hearbeat...

Re-discovering the group

Generating configuration file..

Completed phase 1

Executing phase 2


Completed phase 2

5. Rename the ASM disk group +REDO to +REDO_ENV_1:

[oracle@dsib0057 scripts]$ cat ora_asm_rename_redo.txt

AFD:REDO0001 REDO REDO_ENV_1

AFD:REDO0002 REDO REDO_ENV_1

AFD:REDO0003 REDO REDO_ENV_1

AFD:REDO0004 REDO REDO_ENV_1

AFD:REDO0005 REDO REDO_ENV_1

AFD:REDO0006 REDO REDO_ENV_1

AFD:REDO0007 REDO REDO_ENV_1

AFD:REDO0008 REDO REDO_ENV_1

[oracle@dsib0057 scripts]$ renamedg dgname=REDO newdgname=REDO_ENV_1 config=./ora_asm_rename_redo.txt asm_diskstring='/dev/emc*1,AFD:*'
Parsing parameters..
renamedg operation: dgname=REDO newdgname=REDO_ENV_1 config=./ora_asm_rename_redo.txt asm_diskstring=/dev/emc*1,AFD:*

Executing phase 1

Discovering the group

Checking for hearbeat...

Re-discovering the group

Generating configuration file..

Completed phase 1

Executing phase 2

Completed phase 2

## NOTICE THAT THE AFD LABELS IN THE TEXT FILES WERE CHANGED TO DEVICES

[oracle@dsib0057 scripts]$ cat ora_asm_rename_redo.txt

/dev/emcpowergx1 REDO REDO_ENV_1

/dev/emcpowerha1 REDO REDO_ENV_1

/dev/emcpowergz1 REDO REDO_ENV_1

/dev/emcpowergj1 REDO REDO_ENV_1

/dev/emcpowergs1 REDO REDO_ENV_1

/dev/emcpowergn1 REDO REDO_ENV_1

/dev/emcpowergh1 REDO REDO_ENV_1

/dev/emcpowergk1 REDO REDO_ENV_1

6. On the mount host, mount the ASM disk groups with their new names.

[oracle@dsib0057 scripts]$ TOGRID

[oracle@dsib0057 scripts]$ sqlplus "/ as sysasm"

SQL> select name, state from v$asm_diskgroup;

NAME STATE

------------------------------ -----------

DATA_ENV_1 DISMOUNTED


REDO_ENV_1 DISMOUNTED

GRID MOUNTED

SQL> alter diskgroup DATA_ENV_1 mount;

Diskgroup altered.

SQL> alter diskgroup REDO_ENV_1 mount;

Diskgroup altered.

7. On the mount host, update the file names to their new location.

a. Update the control file location to use the new ASM disk group name. At the end of this step, make sure the database is able to mount. If using a pfile, update the init.ora parameters with the new ASM disk group names.

[oracle@dsib0057 scripts]$ vi $ORACLE_HOME/dbs/initslob1.ora

control_files=('+DATA_ENV_1/cntrlSLOB.dbf')

[oracle@dsib0057 scripts]$ sqlplus "/ as sysdba"

SQL> startup mount;

i) If using an spfile, use the 'alter system set control_files' command:

SQL> startup nomount;

SQL> alter system set control_files='+DATA_ENV_1/cntrlSLOB.dbf' scope=spfile;

SQL> alter database mount;

b. Update the Oracle files with their new location (the database should be in the mounted state). An example script is shown below. It assumes the ASM disk groups were renamed from +DATA to +DATA_<NEW_NAME>, and from +REDO to +REDO_<NEW_NAME>. Update as necessary.

$ vi ora_rename_files.sh
export NEW_NAME=ENV_1
sqlplus -s "/ as sysdba" << EOF2
startup mount;
set linesize 132 pagesize 0 heading off feedback off verify off termout off echo off
spool /tmp/ora_rename_redofile.sql
select 'alter database rename file ''' || member || ''' to ''' || member || ''';' from v\$logfile;
spool off;
spool /tmp/ora_rename_datafile.sql
select 'alter database rename file ''' || name || ''' to ''' || name || ''';' from v\$datafile;
spool off;
spool /tmp/ora_rename_tempfile.sql
select 'alter database rename file ''' || name || ''' to ''' || name || ''';' from v\$tempfile;
spool off;
quit;
EOF2
sed "s/+DATA/+DATA_$NEW_NAME/2" /tmp/ora_rename_datafile.sql > /tmp/ora_rename_data.sql
sed "s/+REDO/+REDO_$NEW_NAME/2" /tmp/ora_rename_redofile.sql > /tmp/ora_rename_redo.sql
sed "s/+DATA/+DATA_$NEW_NAME/2" /tmp/ora_rename_tempfile.sql > /tmp/ora_rename_temp.sql
sqlplus "/ as sysdba" << EOF2
@/tmp/ora_rename_redo.sql
@/tmp/ora_rename_data.sql
@/tmp/ora_rename_temp.sql
alter database open;
quit;
EOF2
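Note on the script: each generated 'alter database rename file' statement contains the file name twice (source and target), so the sed expressions use the occurrence flag '2' to substitute only the second occurrence. This leaves the source path untouched while pointing the rename target at the new disk group name.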

8. If archiving is not required, disable it.

SQL> alter database noarchivelog;

9. Open the database for transactions.

SQL> alter database open;

10. Inspect the data in the test table. Since we used generation 0, which was the

second snapshot, the data shows the table’s records from that time.

SQL> select * from testTbl;

ID STEP

---------- ----------------------------------------

1 Before snapshots taken

2 After first snapshot taken

11. If you are also changing the DBID and DBNAME, be sure to open and shut down the database cleanly first. Therefore, this step can only take place after step 9, where the database is opened (so that it can be shut down cleanly). In addition, the NID utility requires the database to be mounted in exclusive mode. After the NID utility runs successfully, update the spfile or pfile with the new DBNAME and ORACLE_SID for any relevant database parameters before restarting the database.

[oracle@dsib0057 scripts]$ sqlplus "/ as sysdba"

SQL> shutdown immediate;

SQL> startup mount exclusive

SQL> quit


[oracle@dsib0057 scripts]$ nid target=sys dbname=env_1

DBNEWID: Release 12.2.0.1.0 - Production on Mon Sep 25 11:28:53 2017

...

Control Files in database:

+DATA_ENV_1/cntrlslob.dbf

Change database ID and database name SLOB to ENV_1? (Y/[N]) => Y

...

Database name changed to ENV_1.

...

DBNEWID - Completed succesfully.

[oracle@dsib0057 scripts]$ cp $ORACLE_HOME/dbs/initslob1.ora

$ORACLE_HOME/dbs/initenv_1.ora

[oracle@dsib0057 scripts]$ vi $ORACLE_HOME/dbs/initenv_1.ora

#db_name = slob

db_name = env_1

# UPDATE ANY OTHER INIT.ORA PARAMETERS WITH THE INSTANCE NAME

[oracle@dsib0057 scripts]$ export ORACLE_SID=env_1

[oracle@dsib0057 scripts]$ sqlplus "/ as sysdba"

SQL> startup mount;

SQL> alter database open resetlogs;

Restoring restartable snapshot

In this use case, we restore the snapshot back to the production database. This use case applies when a snapshot is taken just before a patch update, a batch run, or a no-logging load. If anything goes wrong, restoring the snapshot brings the database back to its state prior to the operation (though any new data created since the snapshot is lost; if that data is needed, use the recoverable snapshot use case instead).

As before, no media recovery is performed; the database is simply started after the snapshot is restored. As there is no resetlogs involved, all prior backups remain valid. This is simply a fast and safe way of creating a short-term gold copy of the production database without wasting capacity or time. The snapshot restore operation itself takes seconds, even as background copying of the changed data continues. If the host requests data that has not yet been copied, that data is prioritized. As previously discussed, snapshots are protected, and therefore the snapshot can be used over and over again. Make sure that the snapshot finishes the background copy before connecting users to the production database again at full scale.
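As a sketch, the completion of the restore's background copy can be checked before reopening the database to users at full scale (verify the exact flags against your Solutions Enabler release):

# symsnapvx -sg database_sg -snapshot_name database_snap verify -restored

The command succeeds once all devices in the SG reach the Restored state.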

The CLI command to restore a snapshot is shown below:

symsnapvx -sg <SG_NAME> -snapshot_name <SNAPSHOT_NAME> [-generation <number>] restore

This section explains how to restore a snapshot to the production database's devices using CLI. Then, start the database and inspect the data.

1. Before restoring the snapshot, bring down the database and dismount the ASM disk groups on the production host, as their data is about to be refreshed.

NOTE: If the production database is clustered, make sure to shut down the instances and dismount the ASM disk groups on all nodes.

a. Shutdown Oracle RAC instances:

[oracle@dsib0144 ~]$ srvctl stop database -d slob

[oracle@dsib0144 ~]$ srvctl status database -d slob

Instance slob1 is not running on node dsib0144

Instance slob2 is not running on node dsib0146

b. Dismount ASM disk groups:

[oracle@dsib0144 ~]$ srvctl stop diskgroup -g data

[oracle@dsib0144 ~]$ srvctl stop diskgroup -g redo

[oracle@dsib0144 ~]$ srvctl status diskgroup -g redo

Disk Group redo is not running

[oracle@dsib0144 ~]$ srvctl status diskgroup -g data

Disk Group data is not running

2. Choose a snapshot to restore. By listing the snapshots with the -detail flag,

each generation and its date/time is shown.

# symsnapvx -sg database_sg -snapshot_name database_snap list -gb -detail

Storage Group (SG) Name : database_sg

SG's Symmetrix ID : 000197700048 (Microcode Version: 5977)

-------------------------------------------------------------------------------------------------------------

Total

Sym Flags Deltas Non-Shared

Dev Snapshot Name Gen FLRG TS Snapshot Timestamp (GBs) (GBs) Expiration

Date

----- -------------------------------- ---- ------- ------------------------ ---------- ---------- ----------

00067 database_snap 0 .... .. Sun Sep 10 18:20:29 2017 59.6 46.0 NA

database_snap 1 .... .. Sun Sep 10 18:07:26 2017 77.8 65.1 NA

00068 database_snap 0 .... .. Sun Sep 10 18:20:29 2017 60.2 45.1 NA

database_snap 1 .... .. Sun Sep 10 18:07:26 2017 77.9 65.2 NA

...

---------- ----------

2059.9 1772.1


Flags:

(F)ailed : X = Failed, . = No Failure

(L)ink : X = Link Exists, . = No Link Exists

(R)estore : X = Restore Active, . = No Restore Active

(G)CM : X = GCM, . = Non-GCM

(T)ype : Z = zDP snapshot, . = normal snapshot

(S)ecured : X = Secured, . = Not Secured

3. Restore the snapshot. In this case, we restore the latest snapshot (generation 0).

Since it is the default value, there is no need to mention the generation in the

command.

# symsnapvx -sg database_sg -snapshot_name database_snap restore

4. Remount the ASM disk groups and start the Oracle database.

[oracle@dsib0144 ~]$ srvctl start diskgroup -g data

[oracle@dsib0144 ~]$ srvctl start diskgroup -g redo

[oracle@dsib0144 ~]$ srvctl status diskgroup -g data

Disk Group data is running on dsib0146,dsib0144

[oracle@dsib0144 ~]$ srvctl status diskgroup -g redo

Disk Group redo is running on dsib0146,dsib0144

5. Start the production database. Do not perform any media recovery. During this

step Oracle performs crash/instance recovery.

[oracle@dsib0144 ~]$ srvctl start database -d slob

[oracle@dsib0144 ~]$ srvctl status database -d slob

Instance slob1 is running on node dsib0144

Instance slob2 is running on node dsib0146

6. Inspect the data in the test table. Since we used generation 0, which was the

latest snapshot, the data shows the table’s records from that time.

SQL> select * from testTbl;

ID STEP

---------- ----------------------------------------

1 Before snapshots taken

2 After first snapshot taken


Chapter 5 Recoverable Database Snapshots

This chapter presents the following topics:

Recoverable database snapshots overview and requirements
Creating recoverable database snapshot
Mounting recoverable snapshot
Opening a recoverable database on a mount host
Database integrity validation on a mount host
RMAN backup offload to a mount host
RMAN minor recovery of production database using snapshot
Production restore from a recoverable snapshot
Instantiating an Oracle Standby Database using VMAX replications


Recoverable database snapshots overview and requirements

Key reasons for creating recoverable snapshots:

1. Local database backup images: Every mission-critical database requires a backup strategy. VMAX snapshots are created or restored in seconds, regardless of database size, and don't consume any capacity at creation time. VMAX recoverable snapshots therefore provide a favorable option for database backups compared to traditional host-based backups.

Because it is so fast and easy to perform database backups using VMAX snapshots, backups can be taken more often. Each snapshot is a full backup (the current state of the database), while the capacity consumed in the array is based only on the changes to the data since the snapshot was taken.

If the DBA decides to recover the production database, the snapshots are readily available and can be restored in seconds, providing huge savings in recovery time.

2. Read/writable and yet protected backup image: Remember that SnapVX snapshots are protected. If a snapshot is linked to target devices and accessed from the mount host, any changes to the data don't affect the snapshot itself, and it remains a valid database backup. In other words, the same database snapshot that serves as a valid backup image can be opened on the mount host for read/write operations, in either a restartable fashion or a recoverable fashion (with resetlogs), and yet the snapshot itself remains a readily available, valid backup image.

3. RMAN backup offload and database integrity validation: A VMAX snapshot not only provides a readily available backup image of the production database; that image can also be mounted on a mount host. From there, an RMAN backup can take place, sending the backup to another target outside the VMAX, such as Data Domain.

RMAN doesn't care which host the backup or recovery is performed from or to. Therefore, it can perform backups from the mount host, and still recover the database on the production host from that same backup.

In addition, RMAN incremental backups can continue to leverage Oracle Block Change Tracking, even if the RMAN backups take place on the mount host. RMAN can also perform database integrity validation from the mount host.

4. ProtectPoint: ProtectPoint is a product that integrates VMAX recoverable snapshots with Data Domain. With each backup, a snapshot is refreshed and the delta (changes only) is sent directly from the VMAX to Data Domain over the FC fabric. Data Domain applies its catalog, deduplication, compression, and optional replication features to the backups. The restore is also very efficient, as only the required data changes are copied from the Data Domain back to the VMAX. ProtectPoint is covered in a separate white paper.

5. Creating and refreshing an Oracle Standby Database: An Oracle standby database is used by Oracle Data Guard to maintain a remote copy of the production database that can be opened even as it continues to apply logs. Creating or refreshing such a standby database can take many hours for large databases when relying on RMAN alone. SnapVX or SRDF can help shorten this time by leveraging storage-based local or remote replications, and can easily refresh the target whenever necessary.

Use case requirements

To satisfy the requirements for a recoverable database snapshot, perform the following

steps:

1. For Oracle databases prior to 12c: place the database in hot-backup mode.

(Oracle 12c databases can leverage the Storage Snapshot Optimization feature

and don’t require hot-backup mode).

2. Create a snapshot containing all Oracle data files (+DATA).

NOTE: Although it isn’t a requirement, in our snapshot we’ll include both +DATA and +REDO

ASM disk groups, using the parent SG: database_sg. That allows us to use this snapshot as both

recovereable and restartable.

3. For Oracle databases prior to 12c: end hot-backup mode.

4. Switch redo logs, archive the current redo log, and capture a backup control file.

5. Create a snapshot containing the archive logs (+FRA).

By following these steps we create a backup image of the database alongside the

minimum set of archive logs that are required to successfully recover it, even if the

production database is wiped clean. Of course if additional redo or archive logs are

available from production, they can be used for additional recovery.

Additional notes:

Unlike the restart use cases, the recovery use case requires that the Oracle data files and redo logs are separated onto different storage devices and ASM disk groups.

The reason is that in the case of database recovery, only the +DATA snapshot will

be restored. We don’t want to overwrite the redo logs with the +REDO snapshot in

case the current production database redo logs survived and can be used for full

recovery. By separating data files and redo logs to different devices and ASM disk

groups, we can restore only the data files without overwriting the redo logs.

In the example below, we create the snapshot using the parent SG (database_sg),

and therefore include both data files and redo logs. This doesn’t conflict with the

previous point. The reason is that although the snapshot is performed on the parent

SG, in case of a restore, we restore only the child SG, i.e. just +DATA. So, why

did we create a snapshot with both +DATA and +REDO? The reason is that this

snapshot serves as a valid source for both a recoverable as well as a restartable

solution. If a restartable option is not desirable, change the process in step 2 above

to only include the +DATA ASM disk group (data_sg).

When the command 'alter system switch logfile' is executed, Oracle switches the log file. When it does that, it needs to flush the dirty buffers associated with the previous log to disk. In a clustered environment, that can create a storm of writes that may affect database performance. If that's the case, consider using the FAST_START_MTTR_TARGET init.ora parameter. By tuning it correctly, Oracle will limit the number of dirty buffers in cache without affecting database


performance. The outcome is that when the logs switch, the amount of writes won't be overwhelming. Some customers prefer to manually switch logs at the different cluster nodes, one at a time. Our recommendation is to use FAST_START_MTTR_TARGET so as not to add operational overhead.
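As an illustrative sketch (the 300-second target is an example; tune the value for your workload and recovery SLA), the parameter can be set and its effect verified as follows:

SQL> alter system set fast_start_mttr_target=300 scope=both sid='*';
SQL> select target_mttr, estimated_mttr from v$instance_recovery;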

Creating recoverable database snapshot

This section explains how to create a recoverable snapshot using CLI. Alternatively, you

can use Unisphere for VMAX or REST APIs.

1. To demonstrate what data is preserved in the different scenarios, we used a test

table with known records inserted before or after specific steps. To simulate user

workload during the tests, we ran SLOB OLTP benchmark on the source

clustered database.

SQL> create table testTbl (Id int, Step varchar(255)) tablespace slob;

SQL> insert into testTbl values (1, 'Before +DATA & +REDO snapshot');

SQL> commit;

2. Perform this step only if hot-backup mode is used (databases pre-12c), to begin

hot backup mode.

SQL> alter database begin backup;

3. To create a database snapshot that is only recoverable, include only the data

files: data_sg. For a database snapshot that is both recoverable and restartable,

include both data and redo logs together: database_sg.

NOTE: To simplify finding the snapshot time when hot-backup mode is not used, we

included the production host date/time in the snapshot name using ‘date’ command.

# TIMESTAMP=`ssh <db_host> 'echo $(date +"%Y%m%d-%H%M%S")'`

# symsnapvx -sg database_sg -name database_${TIMESTAMP} establish

4. Perform this step only if hot-backup mode is used (databases pre-12c), to end

hot-backup mode.

SQL> alter database end backup;

5. Perform this step only if RMAN incremental backups are offloaded to the mount

host. In that case, the BCT file version must be switched manually on the

production host (see details of this use case later in the section RMAN backup

offload to a mount host), just like RMAN would have done automatically at the end

of the backup if it was performed from the production host.

Make sure BCT is enabled, then switch its version.

SQL> select filename, status, bytes from v$block_change_tracking;

FILENAME STATUS BYTES

-------------------------------------------------- ---------- ----------

+DATA/change_tracking.f ENABLED 22085632


SQL> execute dbms_backup_restore.bctswitch();

6. For demonstration purposes, insert another known record after the first snapshot.

SQL> insert into testTbl values (2, 'After +DATA & +REDO snapshot');

SQL> commit;

7. Perform post snapshot Oracle operations.

SQL> alter system switch logfile;

SQL> alter system archive log current;

SQL> alter database backup controlfile to '+FRA/CTRLFILE_BKUP' reuse;

8. Create a snapshot with the archive logs (ASM +FRA disk group, or fra_sg SG).

This snapshot includes sufficient archives to recover the database so it can open.

# TIMESTAMP=`ssh <db_host> 'echo $(date +"%Y%m%d-%H%M%S")'`

# symsnapvx -sg fra_sg -name fra_${TIMESTAMP} establish

9. For demonstration purposes, insert the last known record for this test.

SQL> insert into testTbl values (3, 'After +FRA snapshot');

SQL> commit;

10. To inspect the snapshots created, use the appropriate level of detail

symsnapvx list

symsnapvx -sg <sg_name> list

symsnapvx -sg <sg_name> -snapshot_name <snapshot_name> list -gb -detail

symsnapvx -sg <sg_name> -snapshot_name <snapshot_name> list -gb -summary

Mounting recoverable snapshot

SnapVX link considerations

SnapVX link considerations are similar to those for the Mounting restartable snapshot use

case, with the addition that we prepared a target SG for the FRA, called fra_mount_sg,

matching the production’s fra_sg with the archive logs.

As before, it is important to consider zoning and LUN masking operations, which make

devices visible to hosts. In this example, the mount host is pre-zoned to the storage array.

The target devices are placed in a masking view and made visible to the mount host, even

before the snapshot is linked. Remember that if partitions are used they will only become

visible to the mount host once the snapshot is linked, and at that time will require Oracle

permissions. This is no longer a consideration if the snapshot is refreshed (relinked) as by

then the partitions will already be set on the mount host with the correct permissions.

When the same snapshot_name is used, list the snapshots based on the SG and

snapshot name to choose the appropriate generation to link. When each snapshot name

is unique, use just the SG name to list the snapshots. Once the source and target SGs are


linked, there is no need to terminate the link in order to relink another snapshot. Just use

the ‘relink’ option in the syntax.

symsnapvx -sg <SG_NAME> -lnsg <TARGET_SG_NAME> -snapshot_name

<SNAPSHOT_NAME> [-generation <number>] [re]link

This section explains how to link a snapshot to target devices using CLI. Then, we recover

the target database and inspect the data.

1. If the target storage groups are in use, make sure to shut down the database on

the mount host, and dismount the appropriate ASM disk groups prior to the link

operation. If the mount host uses RAC, make sure all nodes are included.

[oracle@dsib0057 slob]$ TODB

[oracle@dsib0057 slob]$ sqlplus "/ as sysdba"

SQL> shutdown immediate;

[oracle@dsib0057 slob]$ TOGRID

[oracle@dsib0057 ~]$ asmcmd umount data

[oracle@dsib0057 ~]$ asmcmd umount redo

[oracle@dsib0057 ~]$ asmcmd umount fra
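Alternatively, when the mount host database is managed by Oracle Clusterware, a single srvctl command stops all RAC instances at once (a sketch, assuming the database resource name is 'slob' as in our lab):

[oracle@dsib0057 ~]$ srvctl stop database -d slob -o immediate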

2. Choose a snapshot to link by first listing the snapshots for each storage group

(database_sg, and fra_sg), then link it to the matching SG, providing the desired

snapshot name. If a different snapshot was previously linked between the SGs,

use ‘relink’ in the syntax instead of ‘link’.

# symsnapvx -sg database_sg list

Storage Group (SG) Name : database_sg

SG's Symmetrix ID : 000197700048 (Microcode Version: 5977)

----------------------------------------------------------------------------

Sym Num Flags

Dev Snapshot Name Gens FLRG TS Last Snapshot Timestamp

----- -------------------------------- ---- ------- ------------------------

00067 database_20171025-095033 1 .... .. Wed Oct 25 09:50:33 2017

database_20171024-155406 1 .... .. Tue Oct 24 15:54:04 2017

00068 database_20171025-095033 1 .... .. Wed Oct 25 09:50:33 2017

database_20171024-155406 1 .... .. Tue Oct 24 15:54:04 2017

00069 database_20171025-095033 1 .... .. Wed Oct 25 09:50:33 2017

...

# symsnapvx -sg database_sg -lnsg database_mount_sg link -snapshot_name

database_20171025-095033

# symsnapvx -sg fra_sg list

Storage Group (SG) Name : fra_sg

SG's Symmetrix ID : 000197700048 (Microcode Version: 5977)

----------------------------------------------------------------------------

Sym Num Flags

Dev Snapshot Name Gens FLRG TS Last Snapshot Timestamp


----- -------------------------------- ---- ------- ------------------------

0009B fra_20171025-095142 1 .... .. Wed Oct 25 09:51:42 2017

...

# symsnapvx -sg fra_sg -lnsg fra_mount_sg link -snapshot_name

fra_20171025-095142

3. The mount host should already be zoned and masked to the target devices. If this

is the first time a snapshot is made visible to the mount host, reboot it or rescan

the SCSI bus online to make sure the devices and their partitions are recognized

by the host and have Oracle permissions.
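For example, a minimal online rescan sketch on Linux (assuming FC HBAs exposed under /sys/class/scsi_host; PowerPath or multipath reconfiguration may also be required, depending on your environment):

# for h in /sys/class/scsi_host/host*; do echo "- - -" > ${h}/scan; done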

4. Log in to the ASM instance on the mount host. The ASM disk groups on the target

devices should be visible, though in an unmounted state. Mount them.

[oracle@dsib0057 ~]$ sqlplus "/ as sysasm"

SQL> alter diskgroup data mount;

SQL> alter diskgroup redo mount;

SQL> alter diskgroup fra mount;

SQL> select name, state from v$asm_diskgroup;

NAME STATE

------------------------------ -----------

GRID MOUNTED

DATA MOUNTED

FRA MOUNTED

REDO MOUNTED

5. Restore the backup control file and mount the database.

[oracle@dsib0057 scripts]$ TODB

[oracle@dsib0057 scripts]$ rman target / catalog rco@catdb

...

RMAN> startup nomount;

RMAN> restore controlfile from '+FRA/CTRLFILE_BKUP';

RMAN> alter database mount;

Opening a recoverable database on a mount host

As explained earlier, you need to perform a minimum media recovery before you can

open the Oracle database. Once the minimum recovery is done (past the end hot-backup

mode marker, or past the snapshot-time), the database can be opened for read-only,

which allows running reports, or applying additional archive logs. If Oracle is opened for

read-write, a resetlogs is forced.

Before we start this use case make sure that the database is in a mounted state. Make

sure all ASM disk groups (+DATA, +REDO, and +FRA) are available from the two

snapshots that were linked to target devices, and visible to the mount host.

This section describes how to open the database in read-only mode:


1. If ‘hot-backup’ mode was used during the backup then skip this step. Otherwise,

‘snapshot time’ will be used during the media recovery. To identify the correct

snapshot time to use, we'll compare the snapshot time from 'symsnapvx list' with the checkpoint time from the data file headers, and use the later of the two, as shown below.

a. Inspect the snapshots time based on the ‘symsnapvx list’ command. When

the storage management clock is identical to the database server clock, both

timestamps will match – the one listed by the command, and the one we

added to the snapshot name. In this example they match (if they didn’t, we

would use the one from the snapshot name as it came from the database

server).

The snapshot that is currently linked to the target SG will have an ‘X’ under

the L in the ‘FLRG’ flags.

# symsnapvx -sg database_sg list

Storage Group (SG) Name : database_sg

SG's Symmetrix ID : 000197700048 (Microcode Version: 5977)

----------------------------------------------------------------------------

Sym Num Flags

Dev Snapshot Name Gens FLRG TS Last Snapshot Timestamp

----- -------------------------------- ---- ------- ------------------------

00067 database_20171025-160003 1 .X.. .. Wed Oct 25 16:00:03 2017

database_20171025-095033 1 .... .. Wed Oct 25 09:50:33 2017

database_20171024-155406 1 .... .. Tue Oct 24 15:54:04 2017

00068 database_20171025-160003 1 .X.. .. Wed Oct 25 16:00:03 2017

database_20171025-095033 1 .... .. Wed Oct 25 09:50:33 2017

database_20171024-155406 1 .... .. Tue Oct 24 15:54:04 2017

...

b. Compare the snapshot time from above with the data file headers' checkpoint_time. In the next step, use the later of the two times (in other words, we want to make sure the data files are recovered to a point beyond both snapshot and checkpoint times).

Check the data files’ header checkpoint time. You can run this command on

the mount host, since the database is in a mounted state.

$ cat ./ora_checkpoint_time.sh

#!/bin/bash

set -x

sqlplus "/ as sysdba" << EOF

column name format a50

set linesize 132

select name, checkpoint_change#, to_char(checkpoint_time, 'YYYY-MM-DD

HH24:MI:SS') checkpoint_time from v\$datafile_header;

quit;

EOF


$ ./ora_checkpoint_time.sh

NAME CHECKPOINT_CHANGE# CHECKPOINT_TIME

-------------------------------------------------- ------------------ -------------------

+DATA/SLOB/DATAFILE/system.257.953030737 44140451 2017-10-25 15:39:10

+DATA/SLOB/DATAFILE/sysaux.258.953030739 44140451 2017-10-25 15:39:10

+DATA/SLOB/DATAFILE/sys_undots.259.953030755 44140451 2017-10-25 15:39:10

+DATA/undotbs1.dbf 44140451 2017-10-25 15:39:10

+DATA/undotbs2.dbf 44140451 2017-10-25 15:39:10

+DATA/SLOB/DATAFILE/slob.263.953031317 44140451 2017-10-25 15:39:10

6 rows selected.

As shown, the file header checkpoint time is older than the symsnapvx output

time so we’ll use the latter.

2. Connect to the database and perform the minimum media recovery necessary to

open the database in read-only mode.

If hot backup mode was not used during backup then add the ‘snapshot time…’ to

the recover command.

Make sure to use the syntax of ‘using backup controlfile’ (Oracle will not

apply enough archive logs otherwise).

SQL> recover automatic database until cancel using backup controlfile

snapshot time '2017-10-25 16:00:03';

ORA-00279: change 27380508 generated at 10/25/2017 15:41:25 needed for

thread 2

ORA-00289: suggestion :

+FRA/SLOB/ARCHIVELOG/2017_10_25/thread_2_seq_196.708.958319975

ORA-00280: change 27380508 for thread 2 is in sequence #196

...

...

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

cancel

Media recovery cancelled.

SQL> alter database open read only;

Database altered.

SQL> select * from testTbl;

ID STEP

---------- ------------------------------

1 Before +DATA & +REDO snapshot

2 After +DATA & +REDO snapshot

As shown, the database opened cleanly and both the first record, which was in

the data files at the time of the snapshot, and the record from after the snapshot,

which was in the archive logs, are present.
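For comparison, had hot-backup mode been used when the snapshot was taken, the same minimum recovery would simply omit the 'snapshot time' clause:

SQL> recover automatic database until cancel using backup controlfile;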


Database integrity validation on a mount host

Oracle won't discover a physical database corruption until the affected block is read and identified as corrupt. Since a long time can pass between when blocks are written and when they are read by the database, DBAs often want to proactively check the database for physical corruptions.

There are two ways to test the database for corruptions:

Use Oracle DB Verify utility

Use RMAN to simulate a backup but only read the data and verify it

DB Verify runs against data files only and does not test control files or redo logs. It can run against the files only while the database is not open, which makes it a good use case for a storage snapshot. The following steps explain how to use DB Verify to test the database for corruptions:

1. The database on the mount host can be either offline, or in mounted state. It must

not be open to avoid any changes to the data files while they are being tested.

The following example creates a script that will validate the data files. It can be

modified to run a few validations simultaneously to allow more parallelism.

[oracle@dsib0057 ~]$ TODB

[oracle@dsib0057 ~]$ sqlplus "/ as sysdba"

SQL> set echo off heading off feedback off

SQL> set pagesize 1000

SQL> spool ./ora_dbv_script.sh

SQL> select 'dbv file=' || name || ' logfile=' || ts# || '_' || file# ||

'.dbv_log' from v$datafile;

SQL> spool off;

SQL> quit

2. Review the file, and update the content as necessary.

[oracle@dsib0057 ~]$ cat ora_dbv_script.sh

dbv file=+DATA/SLOB/DATAFILE/system.257.953030737 logfile=0_1.dbv_log

dbv file=+DATA/SLOB/DATAFILE/sysaux.258.953030739 logfile=1_2.dbv_log

dbv file=+DATA/SLOB/DATAFILE/sys_undots.259.953030755 logfile=2_3.dbv_log

dbv file=+DATA/undotbs1.dbf logfile=3_4.dbv_log

dbv file=+DATA/undotbs2.dbf logfile=4_5.dbv_log

dbv file=+DATA/SLOB/DATAFILE/slob.263.953031317 logfile=6_6.dbv_log

3. Execute the script.

[oracle@dsib0057 ~]$ chmod +x ./ora_dbv_script.sh

[oracle@dsib0057 ~]$ nohup ./ora_dbv_script.sh &

# Review the nohup.out and log files output
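To add the parallelism mentioned in step 1, one simple sketch runs the generated dbv commands a few at a time using xargs (the degree of parallelism, 4, is arbitrary):

[oracle@dsib0057 ~]$ xargs -P 4 -I CMD bash -c 'CMD' < ora_dbv_script.sh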

RMAN can run against the full database, including control files, redo logs, and data files. It

is beyond the scope of this paper to cover RMAN; however, here are the basic steps to

test a database using RMAN database validation:


1. Mount the database on the mount host so RMAN can connect to it.

2. Connect to the database from RMAN and perform the validation.

[oracle@dsib0057 ~]$ rman

RMAN> connect target /

RMAN> startup mount;

RMAN> validate database;

Starting validate at 25-OCT-17

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=702 instance=slob1 device type=DISK

channel ORA_DISK_1: starting validation of datafile

channel ORA_DISK_1: specifying datafile(s) for validation

input datafile file number=00006

name=+DATA/SLOB/DATAFILE/slob.263.953031317

input datafile file number=00002

name=+DATA/SLOB/DATAFILE/sysaux.258.953030739

input datafile file number=00004 name=+DATA/undotbs1.dbf

input datafile file number=00005 name=+DATA/undotbs2.dbf

input datafile file number=00001

name=+DATA/SLOB/DATAFILE/system.257.953030737

input datafile file number=00003

name=+DATA/SLOB/DATAFILE/sys_undots.259.953030755

channel ORA_DISK_1: validation complete, elapsed time: 00:11:05

List of Datafiles

=================

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

1 OK 0 354365 393216 26691053

File Name: +DATA/SLOB/DATAFILE/system.257.953030737

Block Type Blocks Failing Blocks Processed

---------- -------------- ----------------

Data 0 26472

Index 0 9061

Other 0 3318

...

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

6 OK 0 473280 158597120 26808371

File Name: +DATA/SLOB/DATAFILE/slob.263.953031317

Block Type Blocks Failing Blocks Processed

---------- -------------- ----------------

Data 0 157286558

Index 0 329088

Other 0 508194


channel ORA_DISK_1: starting validation of datafile

channel ORA_DISK_1: specifying datafile(s) for validation

including current control file for validation

channel ORA_DISK_1: validation complete, elapsed time: 00:00:01

List of Control File and SPFILE

===============================

File Type Status Blocks Failing Blocks Examined

------------ ------ -------------- ---------------

Control File OK 0 1258

Finished validate at 25-OCT-17

RMAN backup offload to a mount host

Performance

Performing RMAN backups directly from the production host has two main disadvantages:

1. The backup process competes with the production host workload for resources

such as CPU, memory, and I/Os. By offloading the backup process to a mount

host, it will not compete with the production database host for these resources.

2. Performing backups directly from the production host means that if a recovery is

needed, the database image will first need to be restored before the recovery

operations can take place. Typical production databases can contain many

terabytes of data, which means that the initial restore operation can take a very

long time.

When using SnapVX to offload the backup to a mount host, first a snapshot is created, providing a readily available and valid backup image of the database. This snapshot is

linked to the target devices and mounted on the mount host. Therefore, an RMAN backup

taking place from the mount host does not compete for host resources with the production

database.

Secondly, the snapshot itself can be restored in seconds to the production host, and

recovery operations can resume immediately. This is a huge savings in recovery time,

compared to host-based backups.

It is also important to remember that RMAN is only concerned with the DBID and the database file locations. As such, RMAN can perform backups from a mount host and recovery from the production host. For that reason, it is a good practice to use an RMAN catalog (rather than a local controlfile) to keep track of backups. RMAN can connect to its catalog over the network regardless of whether it is running on the mount host or on production.

RMAN and Block Change Tracking (BCT)

A typical backup strategy with RMAN utilizes an initial full backup (level 0) followed by

incremental cumulative backups (level 1). Even when performing incremental backups,

Oracle needs to know which blocks to include, and by default performs a full database

scan to determine that.


Alternatively, Oracle provides a bitmap that keeps track of what blocks have changed. The

bitmap is called Block Change Tracking, or BCT. When BCT is enabled, RMAN will

attempt to use it to make incremental backups much faster and more efficient.

When using BCT, keep a few considerations in mind:

A BCT is a file that can be stored externally or within ASM (but not in the Oracle

database). When enabled, the default file location is based on the init.ora

parameter DB_CREATE_FILE_DEST. Enable it using the following SQL command

(database can be open):

SQL> alter database enable block change tracking using file

'+DATA/change_tracking.f' reuse;

SQL> select filename, status, bytes from v$block_change_tracking;

FILENAME STATUS BYTES

-------------------------------------------------- ---------- ----------

+DATA/change_tracking.f ENABLED 22085632

By default, a BCT file tracks eight versions, where each version resets the block

change information. As such, if more than seven incremental (level 1) backups are

performed prior to a new full (level 0), the BCT file won’t be able to provide RMAN

with sufficient information for efficient incremental backup and RMAN will revert to

performing a full database scan. The init.ora parameter ‘_bct_bitmaps_per_file’

can be set to a value greater than eight if that is a concern.
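As a hedged sketch (underscore parameters are hidden and should generally be changed only under Oracle Support guidance; the value 16 below is arbitrary), the parameter is set in the spfile and takes effect after a restart:

SQL> alter system set "_bct_bitmaps_per_file"=16 scope=spfile sid='*';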

When RMAN performs the backup from production, it automatically switches the

BCT file version. When RMAN backup is offloaded to the mount host, the DBA will

execute the BCT switch on the production host manually, using the following

command:

SQL> execute dbms_backup_restore.bctswitch();

These steps explain how to perform an RMAN backup offload to a mount host:

1. Preparation: you should be using an RMAN catalog; otherwise, backup information will be stored in the mount host database controlfile and will be lost with each snapshot refresh. First, register the database from the primary (production database) ahead of time.

[oracle@dsib0144 ~]$ rman target / catalog rco@catdb

RMAN> register database;

2. The mount host database state should be ‘mounted’ prior to performing the

RMAN backup.

3. On the mount host, connect with RMAN to the database and catalog, and perform

the appropriate backup. This paper doesn’t cover the specifics of RMAN backups

but here is a simple example of a level 0 (full) backup:

[oracle@dsib0057 ~]$ TODB

[oracle@dsib0057 ~]$ rman target / catalog rco@catdb


RMAN> run { backup incremental level 0 database; }

## Or, a better example for Oracle 12c:

time rman target / catalog rco@catdb msglog /tmp/rman.log append << EOF

run{

allocate channel ch1 device type disk format '+BACKUP';

allocate channel ch2 device type disk format '+BACKUP';

allocate channel ch3 device type disk format '+BACKUP';

allocate channel ch4 device type disk format '+BACKUP';

backup incremental level 0 database tag 'incr lvl 0' section size

300G;

}

quit;

EOF

Here is an example of a level 1 (incremental) backup:

[oracle@dsib0057 ~]$ TODB

[oracle@dsib0057 ~]$ rman target / catalog rco@catdb

RMAN> run { backup incremental level 1 cumulative database; }

## Or, a better example for Oracle 12c:

time rman target / catalog rco/rco@catdb msglog /tmp/rman.log append <<

EOF

run{

allocate channel ch1 device type disk format '+BACKUP';

allocate channel ch2 device type disk format '+BACKUP';

allocate channel ch3 device type disk format '+BACKUP';

allocate channel ch4 device type disk format '+BACKUP';

backup incremental level 1 cumulative database tag 'incr lvl 1'

section size 300G;

}

quit

EOF

4. To see if the BCT file was used, execute the following query:

SQL> select file#, incremental_level, DATAFILE_BLOCKS, BLOCKS,

BLOCKS_READ,USED_CHANGE_TRACKING from v$backup_datafile;

After incremental level 0 (full):

FILE# INCREMENTAL_LEVEL DATAFILE_BLOCKS BLOCKS BLOCKS_READ USE

---------- ----------------- --------------- ---------- ----------- ---

2 0 3932160 306983 323263 YES


5 0 1310720 19221 1093760 YES

3 0 40960 176 40960 YES

4 0 1310720 19510 1245888 YES

1 0 393216 41868 393216 YES

6 0 158597120 158124107 158343167 YES

After incremental level 1 cumulative:

FILE# INCREMENTAL_LEVEL DATAFILE_BLOCKS BLOCKS BLOCKS_READ USE

---------- ----------------- --------------- ---------- ----------- ---

2 1 3932160 1 40035 YES

5 1 1310720 9953 30539 YES

3 1 40960 1 1 YES

4 1 1310720 10067 37979 YES

1 1 393216 1 13331 YES

6 1 158597120 1827665 144342049 YES

Blocks read (‘BLOCKS_READ’) indicates how much data was read by RMAN as

part of the backup. Data file blocks (‘DATAFILE_BLOCKS’) indicates the number

of blocks in each data file, and blocks (‘BLOCKS’) indicates how many blocks

were actually written as part of the backup.
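As a quick sketch, the fraction of each data file that RMAN read can be computed directly from these columns (the query and the pct_read alias are illustrative):

SQL> select file#, incremental_level, round(100*blocks_read/datafile_blocks,1) pct_read from v$backup_datafile;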

RMAN minor recovery of production database using snapshot

In this example, the production database is still available, but suffered some form of

physical corruption. The snapshot is not restored directly to production, as it will overwrite

its data. Instead, its linked target SG is made visible to the production host. The ASM

+DATA disk group on the linked target SG is renamed to +RESTORED_DATA. RMAN

catalogs it and is used to recover the production database from that snapshot.

NOTE: Only +DATA is made visible to the production host. The assumption is that +REDO

and +FRA on the production host are intact and the corruption is found in the data files.

Be sure to read all the steps first. In particular, pay attention to the steps for preparing the text file used to rename the ASM disk group on the snapshot target devices, described in steps 5 and 6.

To conduct an RMAN minor recovery of the production database using a snapshot, follow

these steps:

1. If the target SGs were previously mounted to a mount host, shut down the mount

host database and dismount the ASM disk groups.

NOTE: If the mount host database is RAC, make sure to shut down and dismount the

ASM disk groups and instances on all nodes.

a. On the mount host, shut down ALL Oracle instances:

[oracle@dsib0057 ~]$ TODB


[oracle@dsib0057 ~]$ sqlplus "/ as sysdba"

SQL> shutdown immediate;

b. Dismount the ASM disk groups:

[oracle@dsib0057 ~]$ TOGRID

[oracle@dsib0057 ~]$ sqlplus "/ as sysasm"

SQL> alter diskgroup data dismount;

SQL> alter diskgroup redo dismount;

SQL> alter diskgroup fra dismount;

2. On the production host, identify the corruption type and location. For demonstration purposes, we corrupted a database block.6

SQL> select * from corrupt_test where password='P7777';

select * from corrupt_test where password='P7777'

*

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 7, block # 154)

ORA-01110: data file 7: '+DATA/bad_data_01.dbf'

[oracle@dsib0144 ~]$ dbv file='+DATA/bad_data_01.dbf' blocksize=8192

DBVERIFY: Release 12.2.0.1.0 - Production on Mon Oct 30 12:19:05 2017

...

Total Pages Marked Corrupt : 1

3. Choose the appropriate recoverable snapshot and link it to the target SG.

# symsnapvx -sg database_sg list

Storage Group (SG) Name : database_sg

SG's Symmetrix ID : 000197700048 (Microcode Version: 5977)

----------------------------------------------------------------------------

Sym Num Flags

Dev Snapshot Name Gens FLRG TS Last Snapshot Timestamp

----- -------------------------------- ---- ------- ------------------------

00067 database_20171030-120916 1 .... .. Mon Oct 30 12:09:17 2017

database_20171030-102525 1 .... .. Mon Oct 30 10:25:25 2017

database_20171029-121717 1 .X.. .. Sun Oct 29 12:17:15 2017

database_20171029-121519 1 .... .. Sun Oct 29 12:15:18 2017

00068 database_20171030-120916 1 .... .. Mon Oct 30 12:09:17 2017

database_20171030-102525 1 .... .. Mon Oct 30 10:25:25 2017

database_20171029-121717 1 .X.. .. Sun Oct 29 12:17:15 2017

database_20171029-121519 1 .... .. Sun Oct 29 12:15:18 2017

00069 database_20171030-120916 1 .... .. Mon Oct 30 12:09:17 2017

...

6 The method to deliberately corrupt a database block in ASM is introduced in this blog.


# symsnapvx -sg database_sg -lnsg database_mount_sg relink -snapshot_name

database_20171030-120916

4. VMAX uses initiator groups and masking views to make devices visible to hosts. If the database_mount_sg SG was visible to the mount host (based on the 'rac_mount_mv' masking view), remove that masking view. Instead, create a new masking view making data_mount_sg visible to the production host.

Only the child storage group data_mount_sg is made visible to production.

Database_mount_sg, which includes both data_sg and redo_sg, is not

made visible to production.

# symaccess list view

Symmetrix ID : 000197700048

Masking View Name Initiator Group Port Group Storage Group

------------------- ------------------- ------------------- -------------------

...

rac_mv rac_ig 048_pg database_sg

rac_mount_mv rac_mount_ig 048_pg database_mount_sg

...

# symaccess delete view -name rac_mount_mv

# symaccess create view -name rac_snap_mv -ig rac_ig -pg 048_pg -sg

data_mount_sg

# symaccess list view

Symmetrix ID : 000197700048

Masking View Name Initiator Group Port Group Storage Group

------------------- ------------------- ------------------- -------------------

...

rac_mv rac_ig 048_pg database_sg

rac_snap_mv rac_ig 048_pg data_mount_sg

...

If this is the first time the data_mount_sg devices are made visible to the

production host, you may need to reboot or rescan the SCSI bus online so that

the host is aware of the devices, can identify their partitions, and can associate Oracle permissions with them.

If you reboot, ASM will not mount the +DATA disk group since it sees both

the original devices and the snapshot target devices. This ASM feature

protects its data and makes this procedure safe to follow. In that case, simply remount the production +DATA ASM disk group after the snapshot ASM disk group has been renamed, as described in the next step.

5. To rename the ASM disk group based on the data_mount_sg SG, prepare a text file containing the snapshot devices from data_mount_sg as they appear on the production host. This file is then used to rename the ASM disk group.


To identify what devices on the production host match the devices of

data_mount_sg see VMAX device identification on a database server. Do not

forget to include the partition number if partitions are used. Here is an example:

# cat asm_rename_to_snap_data.txt

/dev/emcpowercc1 DATA SNAP_DATA

/dev/emcpowerci1 DATA SNAP_DATA

/dev/emcpowerbx1 DATA SNAP_DATA

/dev/emcpowerbs1 DATA SNAP_DATA

/dev/emcpowerbt1 DATA SNAP_DATA

/dev/emcpowerbv1 DATA SNAP_DATA

/dev/emcpowerbq1 DATA SNAP_DATA

/dev/emcpowerce1 DATA SNAP_DATA

/dev/emcpowercj1 DATA SNAP_DATA

/dev/emcpowerbw1 DATA SNAP_DATA

/dev/emcpowerck1 DATA SNAP_DATA

/dev/emcpowercb1 DATA SNAP_DATA

/dev/emcpowerby1 DATA SNAP_DATA

/dev/emcpowercg1 DATA SNAP_DATA

/dev/emcpowercd1 DATA SNAP_DATA

/dev/emcpowercn1 DATA SNAP_DATA

6. Run the ASM disk group rename command on the production host using the text

file.

[oracle@dsib0144 scripts]$ TOGRID

[oracle@dsib0144 scripts]$ asmcmd dsget

parameter:/dev/emc*1, AFD:*

profile:/dev/emc*1,AFD:*

[oracle@dsib0144 scripts]$ renamedg phase=two dgname=DATA

newdgname=SNAP_DATA config=./asm_rename_to_snap_data.txt

asm_diskstring='AFD:*'

Parsing parameters..

renamedg operation: phase=two dgname=DATA newdgname=SNAP_DATA

config=./asm_rename_to_snap_data.txt asm_diskstring=AFD:*

Executing phase 2

Completed phase 2

7. On the production host, mount the renamed disk group (and the original +DATA

disk group if it is not already mounted). Open the production database if it isn’t

already opened.

[oracle@dsib0144 ~]$ TOGRID

[oracle@dsib0144 ~]$ sqlplus "/ as sysasm"

SQL> alter diskgroup snap_data mount;

SQL> alter diskgroup data mount;

[oracle@dsib0144 ~]$ TODB

[oracle@dsib0144 ~]$ srvctl start database -d slob


8. RMAN can catalog the whole +SNAP_DATA ASM disk group, or specific files or

directories within the ASM disk group. Once it does, it becomes aware of that

backup image and can use it to recover the production database.

[oracle@dsib0144 ~]$ TODB

[oracle@dsib0144 ~]$ rman target /

Recovery Manager: Release 12.2.0.1.0 - Production on Mon Oct 30 15:23:06

2017

Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights

reserved.

connected to target database: SLOB (DBID=3679801137)

RMAN> catalog start with '+SNAP_DATA/bad_data_01.dbf' noprompt;

using target database control file instead of recovery catalog

searching for all files that match the pattern +SNAP_DATA/bad_data_01.dbf

List of Files Unknown to the Database

=====================================

File Name: +SNAP_DATA/bad_data_01.dbf

cataloging files...

cataloging done

List of Cataloged Files

=======================

File Name: +SNAP_DATA/bad_data_01.dbf

## An example of RMAN cataloging a whole disk group:

RMAN> catalog start with '+SNAP_DATA' noprompt;

9. Perform RMAN recovery based on the situation. In the example from step 2, there was a single block corruption in data file 7, block 154. Verify that the corruption was fixed.

[oracle@dsib0144 ~]$ rman target /

RMAN> recover datafile 7 block 154;

...

channel ORA_DISK_1: restoring block(s) from datafile copy

+SNAP_DATA/bad_data_01.dbf

starting media recovery

media recovery complete, elapsed time: 00:00:01

[oracle@dsib0144 ~]$ dbv file='+DATA/bad_data_01.dbf' blocksize=8192

DBVERIFY: Release 12.2.0.1.0 - Production on Mon Oct 30 15:30:37 2017


Copyright (c) 1982, 2017, Oracle and/or its affiliates. All rights

reserved.

DBVERIFY - Verification starting : FILE = +DATA/bad_data_01.dbf

DBVERIFY - Verification complete

Total Pages Examined : 1280

Total Pages Processed (Data) : 28

Total Pages Failing (Data) : 0

Total Pages Processed (Index): 0

Total Pages Failing (Index): 0

Total Pages Processed (Other): 131

Total Pages Processed (Seg) : 0

Total Pages Failing (Seg) : 0

Total Pages Empty : 1121

Total Pages Marked Corrupt : 0

Total Pages Influx : 0

Total Pages Encrypted : 0

Highest block SCN : 0 (0.0)

10. Once recovery operations are complete, dismount the +SNAP_DATA disk group

from the production host and remove the masking view rac_snap_mv. Optionally,

recreate the rac_mount_mv if the target SG should be made visible to the mount

host again.
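The following is a sketch of these cleanup commands, based on the names used earlier in this procedure:

[oracle@dsib0144 ~]$ TOGRID
[oracle@dsib0144 ~]$ sqlplus "/ as sysasm"
SQL> alter diskgroup snap_data dismount;

# symaccess delete view -name rac_snap_mv
# symaccess create view -name rac_mount_mv -ig rac_mount_ig -pg 048_pg -sg database_mount_sg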

Production restore from a recoverable snapshot

In this use case, the database is not in a state from which it can be recovered. The

snapshot is restored to the original data_sg, overwriting its data with the valid backup

image followed by database media recovery.

Note that only the data files portion of the snapshot is restored. The assumption is

that +REDO and +FRA on the production host are intact. If that’s not the case, they can

be restored as well. To restore only data, use the child SG: ‘data_sg’. To restore both data

and redo use the parent SG: ‘database_sg’. To restore FRA use the ‘fra_sg’ snapshot.

Be sure to read all the steps first. In particular, make sure that the production redo logs are not overwritten by mistake during the snapshot restore.

Follow these steps to conduct a production restore from a recoverable snapshot:

1. We simulated a disaster by deleting the production database’s data files.

[oracle@dsib0144 ~]$ TODB

[oracle@dsib0144 ~]$ srvctl stop database -d slob

[oracle@dsib0144 ~]$ TOGRID

[oracle@dsib0144 ~]$ asmcmd

ASMCMD> rm -rf +DATA/SLOB/DATAFILE/*


[oracle@dsib0144 ~]$ srvctl start database -d slob

PRCR-1079 : Failed to start resource ora.slob.db

CRS-5017: The resource action "ora.slob.db start" encountered the

following error:

ORA-01157: cannot identify/lock data file 1 - see DBWR trace file

ORA-01110: data file 1: '+DATA/SLOB/DATAFILE/system.257.953030737'

. For details refer to "(:CLSN00107:)" in

"/u01/oracle/diag/crs/dsib0144/crs/trace/crsd_oraagent_oracle.trc".

CRS-2674: Start of 'ora.slob.db' on 'dsib0144' failed

CRS-5017: The resource action "ora.slob.db start" encountered the

following error:

ORA-01157: cannot identify/lock data file 1 - see DBWR trace file

ORA-01110: data file 1: '+DATA/SLOB/DATAFILE/system.257.953030737'

. For details refer to "(:CLSN00107:)" in

"/u01/oracle/diag/crs/dsib0146/crs/trace/crsd_oraagent_oracle.trc".

CRS-2674: Start of 'ora.slob.db' on 'dsib0146' failed

CRS-2632: There are no more servers to try to place resource

'ora.slob.db' on that would satisfy its placement policy

2. Shut down the production database and dismount the ASM disk group that will be

restored. Other disk groups can stay online. In this example, only +DATA is

restored.

NOTE: If the target database is RAC, be sure to shutdown and dismount the ASM disk

groups and instances on all nodes.

a. On the production host, shut down the Oracle database.

[oracle@dsib0144 ~]$ TODB

[oracle@dsib0144 ~]$ srvctl stop database -d slob

b. Dismount the +DATA ASM disk group.

NOTE: Make sure only +DATA is dismounted and not +REDO or +FRA, assuming they survived

the disaster.

[oracle@dsib0144 ~]$ TOGRID

[oracle@dsib0144 ~]$ sqlplus "/ as sysasm"

SQL> alter diskgroup data dismount;

3. List the snapshots and restore the desired snapshot. Note that we use ‘data_sg’

SG.

# symsnapvx list -sg data_sg

Storage Group (SG) Name : data_sg

SG's Symmetrix ID : 000197700048 (Microcode Version: 5977)

----------------------------------------------------------------------------


Sym Num Flags

Dev Snapshot Name Gens FLRG TS Last Snapshot Timestamp

----- -------------------------------- ---- ------- ------------------------

00067 database_20171031-111302 1 .... .. Tue Oct 31 11:13:03 2017

database_20171030-120916 1 .X.. .. Mon Oct 30 12:09:18 2017

database_20171030-102525 1 .... .. Mon Oct 30 10:25:26 2017

database_20171029-121717 1 ..X. .. Sun Oct 29 12:17:16 2017

database_20171029-121519 1 .... .. Sun Oct 29 12:15:19 2017

00068 database_20171031-111302 1 .... .. Tue Oct 31 11:13:03 2017

database_20171030-120916 1 .X.. .. Mon Oct 30 12:09:18 2017

database_20171030-102525 1 .... .. Mon Oct 30 10:25:26 2017

database_20171029-121717 1 ..X. .. Sun Oct 29 12:17:16 2017

database_20171029-121519 1 .... .. Sun Oct 29 12:15:19 2017

...

4. The restore operation takes seconds, regardless of database size.

# symsnapvx -sg data_sg restore -snapshot_name database_20171031-111302

Data copy from the snapshot may proceed in the background and can be

monitored using the command below.

# symsnapvx list -sg data_sg -restored -detail -gb -i 30

Storage Group (SG) Name : data_sg

SG's Symmetrix ID : 000197700048 (Microcode Version: 5977)

-----------------------------------------------------------------------------------------

Sym Flgs Remaining Done

Dev Snapshot Name Gen F Snapshot Timestamp (GBs) (%)

----- -------------------------------- ---- ---- ------------------------ ---------- ----

00067 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.1 33

00068 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 68.2 31

00069 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.1 33

0006A database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.0 34

0006B database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 67.7 32

0006C database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.2 33

0006D database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.0 33

0006E database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 65.6 34

0006F database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.6 33

00070 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.3 33

00071 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 67.0 32

00072 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 65.7 34

00073 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.6 33

00074 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 65.9 34

00075 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 67.4 32

00076 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.5 33

----------

1064.1


...

In most cases, the DBA can proceed with the recovery operations without waiting

for the background copy to complete (assuming the storage utilization is not too

high). However, allow the background copy to finish before opening the database

to user access at scale.

5. Mount the +DATA disk group on the production host.

[oracle@dsib0144 ~]$ TOGRID

[oracle@dsib0144 ~]$ sqlplus "/ as sysasm"

SQL> alter diskgroup data mount;

SQL> select name, state from v$asm_diskgroup;

NAME STATE

------------------------------ -----------

DATA MOUNTED

FRA MOUNTED

GRID MOUNTED

REDO MOUNTED

6. Mount the database and perform media recovery. If hot-backup was not used

when the snapshot was created, use the 'snapshot time' syntax, as in the previous use case (Opening a recoverable database on a mount host), except that this time the recovery takes place on the production database.

[oracle@dsib0144 ~]$ TODB

[oracle@dsib0144 ~]$ sqlplus "/ as sysdba"

SQL> startup mount;

SQL> recover database until cancel using backup controlfile snapshot time

'2017-10-31 11:13:02';

ORA-00279: change 33289682 generated at 10/31/2017 11:12:13 needed for

thread 2

ORA-00289: suggestion :

+FRA/SLOB/ARCHIVELOG/2017_10_31/thread_2_seq_273.276.958821137

ORA-00280: change 33289682 for thread 2 is in sequence #273

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

auto

...

ORA-00279: change 33436715 generated at 10/31/2017 11:15:07 needed for

thread 2

ORA-00289: suggestion : +FRA

ORA-00280: change 33436715 for thread 2 is in sequence #276

ORA-00278: log file

'+FRA/SLOB/ARCHIVELOG/2017_10_31/thread_2_seq_275.647.958821307' no

longer

needed for this recovery


ORA-00308: cannot open archived log '+FRA'

ORA-17503: ksfdopn:2 Failed to open file +FRA

ORA-15045: ASM file name '+FRA' is not in reference form

SQL> alter database open read only;

Database altered.

SQL> select * from testTbl;

ID STEP

---------- ----------------------------------------

1 Before +DATA & +REDO snapshot

2 After +DATA & +REDO snapshot

a. If the online redo logs are not available, you can open the database with resetlogs.

SQL> shutdown immediate;

SQL> startup mount;

SQL> alter database open resetlogs;

Database altered.

b. If the online redo logs are available, apply the latest redo logs, as shown in

the following example.

[oracle@dsib0144 ~]$ TODB

[oracle@dsib0144 ~]$ rman target /

RMAN> shutdown immediate;

RMAN> startup mount;

RMAN> recover database;

Starting recover at 31-OCT-17

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=283 instance=slob1 device type=DISK

starting media recovery

archived log for thread 1 with sequence 21 is already on disk as file

+FRA/SLOB/ARCHIVELOG/2017_10_31/thread_1_seq_21.955.958833345

archived log for thread 2 with sequence 7 is already on disk as file

+REDO/SLOB/ONLINELOG/group_7.262.953030783

archived log file name=+REDO/SLOB/ONLINELOG/group_7.262.953030783

thread=2 sequence=7

archived log file

name=+FRA/SLOB/ARCHIVELOG/2017_10_31/thread_1_seq_21.955.958833345

thread=1 sequence=21


Finished recover at 31-OCT-17

RMAN> alter database open resetlogs;

Statement processed

RMAN> quit

7. The latest transactions are visible, as shown below.

[oracle@dsib0144 ~]$ sqlplus "/ as sysdba"

SQL> select * from testTbl;

ID STEP

---------- ----------------------------------------

1 Before +DATA & +REDO snapshot

2 After +DATA & +REDO snapshot

3 After +FRA snapshot

8. After confirming that the restored snapshot has finished any background copy, terminate the restore session. Only the restore session is terminated, not the snapshot itself, by specifying the '-restored' option.

[root@dsib0144 ~]# symsnapvx -sg data_sg verify -restored -snapshot_name

database_20171031-143014

All devices in the group 'data_sg' are in 'Restored' state.

[root@dsib0144 ~]# symsnapvx -sg data_sg terminate -restored -

snapshot_name database_20171031-143014

9. The database is now available for all operations and all nodes can be brought

online. If the database was opened with resetlogs, create a new recoverable

backup image immediately as the new backup base.
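For example, repeat the full procedure from Creating recoverable database snapshot; at its core (a sketch, with the hot-backup mode or post-snapshot steps performed as described there):

# symsnapvx -sg database_sg -name database_$(date +"%Y%m%d-%H%M%S") establish
# symsnapvx -sg fra_sg -name fra_$(date +"%Y%m%d-%H%M%S") establish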

Instantiating an Oracle Standby Database using VMAX replications

With VMAX replications, instantiating a standby database for large databases can be much easier than using backup and restore.

This use case shows how to instantiate the standby database using SnapVX. A similar

operation can take place using SRDF leveraging SnapVX remotely, or the SRDF R2

devices directly.

Important: Unlike the database backup and recovery use cases discussed earlier, a standby database in managed recovery is not well integrated with the 'snapshot time' alternative to hot-backup mode, even with Oracle 12c.

For that reason, if using hot-backup mode, the process of instantiating the standby is simpler. If using 'snapshot time', the instantiation is done in two parts: first, recover the target database using 'snapshot time' as if it were a normal backup image; second, once the database is recovered past the snapshot time, turn the image into a standby database.

To prepare to instantiate an Oracle standby database using VMAX replications, follow

these steps:

1. On the production host, enable force logging.

SQL> alter database force logging;

2. Configure the production database with standby redo logs (in case of a role

switch). Update the sizing below as appropriate.

SQL> alter database add standby logfile size 10G;

SQL> alter database add standby logfile size 10G;

SQL> alter database add standby logfile size 10G;

SQL> alter database add standby logfile size 10G;

SQL> select l.group#,l.thread#,l.bytes/1024/1024/1024 GB,lf.type from

v$standby_log l, v$logfile lf where l.group#=lf.group#;

3. Update the production database's init.ora/spfile parameters as appropriate. In this example, we used Hopkinton for the primary database location and Austin for the target.

##########################################

# For Standby DB

db_unique_name=hopkinton # Primary unique name

control_files=('+DATA/cntrlSLOB.dbf') # Primary controlfile location

#db_unique_name=austin # Standby unique name

#control_files=('+DATA/austin.ctl') # Standby controlfile location

LOG_ARCHIVE_DEST_1=

'LOCATION=USE_DB_RECOVERY_FILE_DEST

VALID_FOR=(ALL_LOGFILES,ALL_ROLES)

DB_UNIQUE_NAME=hopkinton'

LOG_ARCHIVE_DEST_2=

'SERVICE=austin ASYNC

VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)

DB_UNIQUE_NAME=austin'

REMOTE_LOGIN_PASSWORDFILE=EXCLUSIVE

LOG_ARCHIVE_FORMAT=%t_%s_%r.arc

FAL_SERVER=austin

DB_FILE_NAME_CONVERT='/austin/','/hopkinton/'

LOG_FILE_NAME_CONVERT='/austin/','/hopkinton/'

STANDBY_FILE_MANAGEMENT=AUTO

##########################################

4. Update TNSNAMES.ORA at both sites as shown below.

...


austin =

(DESCRIPTION =

(ADDRESS_LIST =

(ADDRESS = (PROTOCOL = TCP)(HOST = dsib0057)(PORT = 1521))

)

(CONNECT_DATA =

(SERVICE_NAME = austin)

)

)

hopkinton =

(DESCRIPTION =

(ADDRESS_LIST =

(ADDRESS = (PROTOCOL = TCP)(HOST = dsib0144)(PORT = 1521))

)

(CONNECT_DATA =

(SERVICE_NAME = hopkinton)

)

)

5. On the production database, enable archiving (if not already enabled).

SQL> shutdown immediate;

SQL> startup mount;

SQL> alter database archivelog;

SQL> alter database open;

6. Copy the production database’s init.ora to the standby site and make any

appropriate changes, as shown below.

##########################################

# For Standby DB

#db_unique_name=hopkinton # Primary unique name

#control_files=('+DATA/cntrlSLOB.dbf') # Primary controlfile location

db_unique_name=austin # Standby unique name

control_files=('+DATA/austin.ctl') # Standby controlfile location

LOG_ARCHIVE_DEST_1=

'LOCATION=USE_DB_RECOVERY_FILE_DEST

VALID_FOR=(ALL_LOGFILES,ALL_ROLES)

DB_UNIQUE_NAME=austin'

LOG_ARCHIVE_DEST_2=

'SERVICE=austin ASYNC

VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)

DB_UNIQUE_NAME=hopkinton'

REMOTE_LOGIN_PASSWORDFILE=EXCLUSIVE

LOG_ARCHIVE_FORMAT=%t_%s_%r.arc

FAL_SERVER=hopkinton

DB_FILE_NAME_CONVERT='/hopkinton/','/austin/'

LOG_FILE_NAME_CONVERT='/hopkinton/','/austin/'

STANDBY_FILE_MANAGEMENT=AUTO


##########################################

Follow these steps to use hot-backup mode to create a replica for the standby database:

1. On the standby database host make sure the database instances are shut

down. Dismount the ASM disk groups +DATA, +REDO, +FRA.

SQL> sqlplus "/ as sysdba"

SQL> shutdown immediate;

SQL> sqlplus "/ as sysasm"

SQL> alter diskgroup data dismount;

SQL> alter diskgroup redo dismount;

SQL> alter diskgroup fra dismount;

2. For demonstration purposes, simulate a user transaction on the production host

by running SLOB OLTP workload. Add some records to the test table.

SQL> truncate table testTbl;

SQL> insert into testTbl values (1, 'Before +DATA, +REDO, and +FRA

snapshot');

SQL> commit;

3. On the production host, create a standby control file.

SQL> alter database create standby controlfile as '+DATA/austin.ctl'

reuse;

4. On the production host, start backup mode.

SQL> alter database begin backup;

5. Create snapshots: database_sg (+DATA and +REDO), and fra_sg (+FRA)

# symsnapvx -sg database_sg -name stdby_database_$(date +"%Y%m%d-%H%M%S")

establish

# symsnapvx -sg fra_sg -name stdby_fra_$(date +"%Y%m%d-%H%M%S") establish

6. On the production host, end backup mode, then switch logs and archive.

SQL> alter database end backup;

SQL> insert into testTbl values (2, 'After +DATA, +REDO, and +FRA

snapshot');

SQL> commit;

SQL> alter system switch logfile;

SQL> alter system archive log current;

7. Link or relink the new snapshots to the target storage groups.


# symsnapvx -sg database_sg list # choose snapshot

# symsnapvx -sg database_sg -lnsg database_mount_sg -snapshot_name stdby_database_20171020-202743 relink

# symsnapvx -sg fra_sg list # choose snapshot

# symsnapvx -sg fra_sg -lnsg fra_mount_sg -snapshot_name stdby_fra_20171020-202749 relink
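Optionally, before remounting the disk groups, confirm which targets the snapshot is now linked to. A hedged sketch using the list action's link details (flag availability may vary by Solutions Enabler version):

# symsnapvx -sg database_sg -snapshot_name stdby_database_20171020-202743 list -linked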

8. On the standby host, re-mount the ASM disk groups.

SQL> sqlplus "/ as sysasm"

SQL> alter diskgroup data mount;

SQL> alter diskgroup redo mount;

SQL> alter diskgroup fra mount;

Starting the standby database from the replica

1. On the standby host, use the init.ora with the standby changes.

[oracle@dsib0057 ~]$ vi $DB_HOME/dbs/initslob1.ora

# For Standby DB

#db_unique_name=hopkinton # Primary unique name

#control_files=('+DATA/cntrlSLOB.dbf') # Primary controlfile

db_unique_name=austin # Standby unique name

control_files=('+DATA/austin.ctl') # Standby controlfile location

2. Use the password file copied from the primary.

[oracle@dsib0057 dbs]$ scp dsib0144:$DB_HOME/dbs/orapwslob1

$DB_HOME/dbs/orapwslob1

3. Mount the standby database in managed recovery.

SQL> sqlplus "/ as sysdba"

SQL> startup mount;

SQL> alter database recover managed standby database disconnect from session;
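To observe the apply progress on the standby, the standard Data Guard view can be queried, for example:

SQL> select process, status, sequence# from v$managed_standby;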

4. To test the standby database, add some records to the production database.

SQL> insert into testTbl values (3, 'After standby managed recovery started');

SQL> commit;

SQL> alter system switch logfile;

SQL> alter system archive log current;

5. Open the standby database in read-only (managed recovery) mode, and inspect the data to see whether the updates from production are arriving. Note that it can take some time for updates to appear in the standby, depending on how many logs remain to be applied.

SQL> alter database recover managed standby database cancel;

SQL> alter database open;


Database altered.

SQL> select * from testTbl;

ID STEP

---------- --------------------------------------------------

1 Before +DATA, +REDO, and +FRA snapshot

2 After +DATA, +REDO, and +FRA snapshot

3 After standby managed recovery started

SQL> alter database recover managed standby database disconnect from session;

Database altered.

SQL> select open_mode from v$database;

OPEN_MODE

--------------------

READ ONLY WITH APPLY

Using snapshot-time

Creating a replica for the standby database

1. On the standby host make sure the database instances are shut down, and

dismount the ASM disk groups +DATA, +REDO, +FRA.

SQL> sqlplus "/ as sysdba"

SQL> shutdown immediate;

SQL> sqlplus "/ as sysasm"

SQL> alter diskgroup data dismount;

SQL> alter diskgroup redo dismount;

SQL> alter diskgroup fra dismount;

2. For demonstration purposes, SLOB was used to simulate OLTP user transactions

on the production host. Add some records to the test table.

SQL> truncate table testTbl;

SQL> insert into testTbl values (1, 'Before +DATA, +REDO, and +FRA snapshot');

SQL> commit;

3. Create a standby control file and place it in the +DATA ASM disk group, since

we’ll be replicating that disk group to the standby site.

SQL> alter database create standby controlfile as '+DATA/austin.ctl' reuse;

4. Create snapshots of just the database_sg (+DATA and +REDO). Do not use hot-backup mode.


# symsnapvx -sg database_sg -name stdby_database_$(date +"%Y%m%d-%H%M%S") establish

5. Capture the backup controlfile and archive the current log.

SQL> insert into testTbl values (2, 'After +DATA and +REDO snapshot');

SQL> commit;

SQL> alter database backup controlfile to '+FRA/CTRLFILE_BKUP' reuse;

SQL> alter system switch logfile;

SQL> alter system archive log current;

6. Create a snapshot of fra_sg (+FRA).

# symsnapvx -sg fra_sg -name stdby_fra_$(date +"%Y%m%d-%H%M%S") establish

7. Link or relink the new snapshots to the target storage groups.

# symsnapvx -sg database_sg list # choose snapshot

# symsnapvx -sg database_sg -lnsg database_mount_sg relink -snapshot_name stdby_database_20171022-115008

# symsnapvx -sg fra_sg list # choose snapshot

# symsnapvx -sg fra_sg -lnsg fra_mount_sg relink -snapshot_name stdby_fra_20171022-115035

8. On the standby host, re-mount the ASM disk groups.

SQL> sqlplus "/ as sysasm"

SQL> alter diskgroup data mount;

SQL> alter diskgroup redo mount;

SQL> alter diskgroup fra mount;

As mentioned earlier, when using ‘snapshot time’ there are two steps: first perform a

manual media recovery on the standby host, using the ‘snapshot time’ syntax. Once the

database can be opened in read-only mode (when enough recovery has been performed),

convert the replica to a standby database. Following is the detailed description of these

steps.

Manual media recovery using ‘snapshot-time’

1. On the standby host, in order for automatic media recovery to find the production database archive logs, you must use the same DB_UNIQUE_NAME and CONTROL_FILES settings as production. Temporarily update the init.ora parameters on the standby host accordingly.

[oracle@dsib0057 ~]$ vi $DB_HOME/dbs/initslob1.ora

# For Standby DB

db_unique_name=hopkinton # Primary db unique name

control_files=('+DATA/cntrlSLOB.dbf') # Primary db ctl location

#db_unique_name=austin # Standby db unique name

#control_files=('+DATA/austin.ctl') # Standby db ctl location


2. On the standby host, use the password file copied from production.

[oracle@dsib0057 dbs]$ scp dsib0144:$DB_HOME/dbs/orapwslob1

$DB_HOME/dbs/orapwslob1

3. On the standby host, optionally restore the backup controlfile.

[oracle@dsib0057 scripts]$ TODB

[oracle@dsib0057 scripts]$ rman

RMAN> connect target /

RMAN> startup nomount;

RMAN> restore controlfile from '+FRA/CTRLFILE_BKUP';

RMAN> shutdown

4. On the standby host, perform manual media recovery with the available archives

using the ‘snapshot time’ syntax.

SQL> recover automatic database until cancel using backup controlfile snapshot time '2017-10-22 11:50:08';

SQL> alter database open read only;

SQL> select * from testTbl;

ID STEP

---------- --------------------------------------------------

1 Before +DATA, +REDO, and +FRA snapshot

2 After +DATA and +REDO snapshot

SQL> shutdown immediate;
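The time passed to the 'snapshot time' clause is the creation time of the +DATA/+REDO snapshot. It can be read back from the snapshot itself (the snapshot name in this example encodes the same timestamp):

# symsnapvx -sg database_sg -snapshot_name stdby_database_20171022-115008 list -gb -detail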

Converting the replica to a standby database

1. On the standby host, update the init.ora to the standby parameters.

[oracle@dsib0057 ~]$ vi $DB_HOME/dbs/initslob1.ora

# For Standby DB

#db_unique_name=hopkinton # Primary unique name

#control_files=('+DATA/cntrlSLOB.dbf') # Primary controlfile location

db_unique_name=austin # Standby unique name

control_files=('+DATA/austin.ctl') # Standby controlfile location

2. Mount the standby database in managed recovery.

SQL> sqlplus "/ as sysdba"

SQL> startup mount;

SQL> show parameters control_files

NAME TYPE VALUE

------------------------------------ ----------- ------------------------------

control_files string +DATA/austin.ctl

SQL> show parameters unique

NAME TYPE VALUE


------------------------------------ ----------- ------------------------------

db_unique_name string austin

SQL> alter database recover managed standby database disconnect from session;

3. To test the standby, add some records to the production database.

SQL> insert into testTbl values (3, 'After standby managed recovery started');

SQL> commit;

SQL> alter system switch logfile;

SQL> alter system archive log current;

4. Open the standby database and inspect the data. It could take some time until the

latest production data is shown in the standby, based on how many transactions

need to be recovered first.

SQL> alter database recover managed standby database cancel;

SQL> alter database open;

Database altered.

SQL> select * from testTbl;

ID STEP

---------- --------------------------------------------------

1 Before +DATA, +REDO, and +FRA snapshot

2 After +DATA and +REDO snapshot

3 After standby managed recovery started

SQL> alter database recover managed standby database disconnect from session;

Database altered.

SQL> select open_mode from v$database;

OPEN_MODE

--------------------

READ ONLY WITH APPLY


Chapter 6 Remote Replications with SRDF

This chapter presents the following topics:

Remote replications with SRDF overview and requirements ......................... 82

Initiating database replications with SRDF ...................................................... 84

Failover operations to the remote site ............................................................. 89

Creating remote restartable database snapshots ........................................... 97

Mounting a remote restartable snapshot ......................................................... 98

Refreshing remote restartable snapshot ....................................................... 102

Mounting remote restartable snapshot with a new DBID and file location . 105

Creating remote recoverable database snapshots ....................................... 105

Mounting remote recoverable snapshot ........................................................ 107

RMAN backup offload to a remote mount host ............................................. 107

Opening a remote recoverable database on mount host ............................. 107

Production restore from a remote recoverable snapshot ............................ 107


Remote replications with SRDF overview and requirements

SRDF provides many ways in which the data can be replicated synchronously or

asynchronously between two or more VMAX storage arrays. Both SRDF/S (synchronous

replications) and SRDF/A (asynchronous replications) maintain the database consistency

at the remote site. While SRDF/S maintains consistency for each I/O, SRDF/A uses the

notion of ‘cycles’ and ensures that any two consecutive I/Os on the source array(s) both

enter the same cycle, or that the second I/O enters the next cycle, thus maintaining

consistency across cycles, where the default cycle time is 15 seconds.

Therefore, the target of both SRDF/S and SRDF/A is restartable, though with SRDF/A it

has a slight lag. That lag can increase during peak loads as data makes its way to the

remote site, and shrink back to the default 15 seconds afterwards. If a disaster hits the

source database, Oracle can start from the target array(s) as if the database went through

shutdown-abort or a server crash. It will simply restart, performing instance or crash

recovery using only data, control, and redo log files (no archive logs are used).
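For illustration, a minimal restart sketch at the remote site (assuming the R2 devices are already visible to the remote servers and hold the +DATA and +REDO disk groups; the commands mirror those used elsewhere in this paper):

SQL> alter diskgroup data mount;

SQL> alter diskgroup redo mount;

SQL> startup

No media recovery is issued; Oracle performs crash or instance recovery automatically as part of startup.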

In addition, SRDF is tightly integrated with SnapVX. As a result, SnapVX can create

recoverable or restartable snapshots at the remote site without interrupting SRDF

replications, allowing all the use cases discussed previously for snapshots to be executed

from either the source or the target arrays while SRDF/S or SRDF/A are used.

Following are key use cases for creating remote replications with SRDF.

Remote database restart use cases:

1. Disaster Restart (DR): The main strength of SRDF/S or SRDF/A is its ability to

create a remote consistent copy of the database at another fault domain, whether

that is on a different array, building, data center, or continent. SRDF can easily

replicate while maintaining consistency across multiple related databases,

including any external data, and/or associated message queues. As a result, all

systems at the target are not only restartable, but also consistent with each

other.

2. Creating remote database copies: As in the local restartable database

snapshots use case, you can create new database environments at the remote

array, using the same steps, and without disturbing the on-going SRDF

replication.

Remote database recovery use cases:

1. Remote database backup images: As with the local recoverable database

snapshots use case, you can create valid backup images at the remote array,

using SnapVX, without disrupting SRDF. The remote recoverable images can be

restored either to the remote site, or to the local site. If a restore is needed to

the local array(s), SnapVX and SRDF restore will work in parallel to replicate

only the changed data.

2. RMAN backup offload to remote VMAX: As with the recoverable database

snapshots use case, you can offload RMAN backups to the remote array,

potentially at another data center. If necessary, RMAN recovery can be used on

either the remote or the local database.


3. Creating and refreshing an Oracle standby database: As with the recoverable

database snapshots use case, you can use SRDF to instantiate an Oracle

standby database and refresh it using SRDF’s incremental sync capability.

Disaster Restart with SRDF use case requirements

All database redo logs, control, and data files must be replicated together

consistently.

SRDF must be in SRDF/S or SRDF/A mode where SRDF/S must be in a

‘Synchronized’ state and SRDF/A must be in a ‘Consistent’ state. It is best practice

to enable consistency whether using Sync or Async replications.

Disaster Recovery with SRDF and remote SnapVX use case requirements

The following are requirements for a remote database recovery use case:

Include data files and archive logs. Most often, SRDF replications focus on DR

(Disaster Restart), and therefore will already include all data, log, and control files.

In order to extend the use case to support recovery, the archive logs are simply

added to the remote replications (for example, the fra_sg SRDF group is added).

A remote recoverable image is created using SnapVX at the remote array, without

interrupting SRDF replications.

When SRDF/A is used, an SRDF ‘checkpoint’ command is issued prior to each

SnapVX establish command (snapshot creation). SRDF checkpoint makes sure

that data in the local array reaches the remote array. This is important in a recovery

use case.

For example, if a hot backup mode is used, the remote snapshot must be taken

after the database was placed in backup mode. Issuing the checkpoint command

prior to creating the remote snapshot ensures that the R2 devices contain the

backup mode state. Similarly, after issuing the ‘archive log current’ command, the

SRDF checkpoint command ensures that the R2 devices contain these latest

archive logs prior to the remote snapshot of the fra_sg.
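For example, a hedged sketch of the ordering for capturing the latest archive logs in a remote snapshot under SRDF/A (the SRDF group numbers follow the examples later in this chapter; the checkpoint action returns once the captured source data has reached the R2 devices):

SQL> alter system archive log current;

# symrdf -sid 048 -rdfg 11 -sg fra_sg checkpoint

# symsnapvx -sid 047 -sg fra_sg -name remote_fra_snap establish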

SRDF management using an SG or a CG

SRDF has many topologies that satisfy different disaster protection and replications

strategies. Their management aspects are covered in the Solutions Enabler SRDF

Product Guide.

The configuration and examples used in this paper assume that there is a basic

environment with a single source and target arrays and a single SRDF group for

each Storage Group (SG). As a result, this paper uses SGs to manage the SRDF

replications. This can only be done when there is a single SRDF group associated

with each SG.

In an environment where database devices are spread across multiple arrays (and

SGs), or where multiple SGs must be replicated consistently, a Consistency Group

(CG) is created and SRDF is managed using the CG, not the SG.

While the examples in this paper do not cover the use of CGs or more complex

SRDF topologies, from an Oracle ASM and database perspective, the operations

remain the same. However, the SRDF management commands may change based

on the topology and use of CGs versus SGs.
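As a hedged illustration only (exact syntax varies by Solutions Enabler version; consult the Solutions Enabler SRDF Product Guide), a composite group spanning the database and FRA devices might be built and protected along these lines, where ora_cg and device 00067 are illustrative names:

# symcg create ora_cg -type rdf1

# symcg -cg ora_cg add dev 00067 -sid 048 # repeat for each database_sg and fra_sg device

# symcg -cg ora_cg enable # enable SRDF consistency protection on the CG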


NOTE: See additional considerations for local and remote replications in Chapter 3.

Initiating database replications with SRDF

This example shows how to create database replications with SRDF. The primary use

case is Disaster Restart; however, by adding replications of +FRA ASM disk group and

utilizing remote SnapVX, other use cases are made available, as will be shown later.

Execute all storage commands from the local storage management host.

1. To set up replications between the matching storage groups on the local and

remote arrays, first create an SRDF group. An SRDF group declares which of

the SRDF adapters (RA’s) and ports of each array participate in the replications. It

also lets you provide an SRDF group number and label for ease of management.

To create the SRDF group follow these sub-steps:

a. List the SRDF adapters and ports on the local (048) and remote (047) arrays.

# symcfg -sid 048 list -ra all

Symmetrix ID: 000197700048 (Local)

S Y M M E T R I X R D F D I R E C T O R S

Remote Local Remote Status

Ident Port SymmID RA Grp RA Grp Dir Port

----- ---- ------------ -------- -------- ---------------

RF-1H 8 000197700047 2 (01) 2 (01) Online Online

8 000197700047 4 (03) 42 (29) Online Online

9 - - - Online Online

RF-2H 8 000197700047 2 (01) 2 (01) Online Online

8 000197700047 4 (03) 42 (29) Online Online

9 - - - Online Online

# symcfg -sid 047 list -ra all

Symmetrix ID: 000197700047 (Remote)

S Y M M E T R I X R D F D I R E C T O R S

Remote Local Remote Status

Ident Port SymmID RA Grp RA Grp Dir Port

----- ---- ------------ -------- -------- ---------------

RF-1H 8 000197700048 2 (01) 2 (01) Online Online

8 000197700048 42 (29) 4 (03) Online Online


9 - - - Online PendOn

RF-2H 8 000197700048 2 (01) 2 (01) Online Online

8 000197700048 42 (29) 4 (03) Online Online

9 - - - Online PendOn

b. Choose the appropriate ports on each array and use them to create SRDF groups. Both VMAX arrays have directors 1H port 8 and 2H port 8 available, as shown in the previous step. Also, SRDF group number 10 is not already in use.

# symrdf addgrp -label ora_db -rdfg 10 -sid 048 -dir 1H:8,2H:8 -remote_sid 047 -remote_dir 1H:8,2H:8 -remote_rdfg 10

# symcfg list -rdfg 10

2. Create a replication session between the local and remote SGs using the

newly created SRDF group.

# symrdf -sid 048 -sg database_sg -rdfg 10 createpair -type R1 -remote_sg database_sg -establish

3. Monitor the synchronization progress between the arrays.

The initial SRDF mode for the newly created SRDF group is Adaptive Copy Disk

(ACP_DISK). This mode operates efficiently for initial bulk data transfers and

works in batches, sending each of the required changes asynchronously,

although out of order. If the change rate on the source is low, the target may get

fully sync’d and the state of the pairs will change to ‘Synchronized’. However, if

there are many data changes in the production database, the ‘SyncInProg’ state

will remain and SRDF will continue to send updates to the target in batches.

# symrdf -sid 048 list -rdfg 10 -i 30

...

Symmetrix ID: 000197700048

Local Device View

-----------------------------------------------------------------------------

STATUS MODES RDF S T A T E S

Sym Sym RDF --------- ----- R1 Inv R2 Inv ----------------------

Dev RDev Typ:G SA RA LNK MDATE Tracks Tracks Dev RDev Pair

----- ----- -------- --------- ----- -------- -------- --- ---- -------------

00067 00178 R1:10 RW RW RW C.D1. 0 708487 RW WD SyncInProg

00068 00179 R1:10 RW RW RW C.D1. 0 708280 RW WD SyncInProg

00069 0017A R1:10 RW RW RW C.D1. 0 708831 RW WD SyncInProg

...

Synchronization rate : 2997.4 MB/S

Estimated time to completion : 00:08:14

Legend for MODES:


(M)ode of Operation : A = Async, S = Sync, E = Semi-sync, C = Adaptive Copy

: M = Mixed, T = Active

(D)omino : X = Enabled, . = Disabled

(A)daptive Copy : D = Disk Mode, W = WP Mode, . = ACp off

Mirror (T)ype : 1 = R1, 2 = R2

Consistency (E)xempt: X = Enabled, . = Disabled, M = Mixed, - = N/A

...

4. Eventually, either all the data is copied and the SRDF pairs' state shows 'Synchronized', or the number of invalid tracks between the source and target arrays shrinks sufficiently. At that time, change the SRDF mode to either Sync or Async so that the target devices can be consistent with the source.

NOTE: Only SRDF/S or SRDF/A (and their variants, such as SRDF/A MSC, cascaded SRDF,

or SRDF/STAR) are valid consistent replications for Oracle. SRDF ACP does not keep the

target consistent with the source and is only meant for data refresh. Once SRDF mode has

changed to Sync or Async, the target devices are only consistent when their state is no

longer ‘SyncInProg’ and rather ‘Synchronized’ (for SRDF/S) or ‘Consistent’ (for SRDF/A).

In this example, SRDF replication mode is changed to synchronous.

# symrdf -sid 048 -rdfg 10 -sg database_sg set mode sync

NOTE: Although the example above uses an SG, it is recommended to enable consistency even for SRDF/S, which requires the use of a CG instead of an SG.

In this example, SRDF replication mode is changed to asynchronous.

# symrdf -sid 048 -rdfg 10 -sg database_sg set mode async

# symrdf -sid 048 -rdfg 10 -sg database_sg enable

NOTE: As shown above, SRDF/A allows enabling consistency using SG when a single SRDF

group is used. Otherwise, SRDF/A also requires the use of CG.
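With either mode, before relying on the target devices you can poll until the pairs leave 'SyncInProg'; the verify action (also used later in this chapter) accepts a polling interval, for example:

# symrdf -sid 048 -rdfg 10 -sg database_sg verify -consistent -i 30

Use 'verify -synchronized' instead when running SRDF/S.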

5. Review the state of the replications.

The example below shows SRDF/A mode and the devices in ‘Consistent’ state:

# symrdf list -rdfg 10

Symmetrix ID: 000197700048

Local Device View

-----------------------------------------------------------------------------

STATUS MODES RDF S T A T E S

Sym Sym RDF --------- ----- R1 Inv R2 Inv ----------------------

Dev RDev Typ:G SA RA LNK MDATE Tracks Tracks Dev RDev Pair

----- ----- -------- --------- ----- -------- -------- --- ---- -------------


00067 00178 R1:10 RW RW RW A..1. 0 0 RW WD Consistent

00068 00179 R1:10 RW RW RW A..1. 0 0 RW WD Consistent

00069 0017A R1:10 RW RW RW A..1. 0 0 RW WD Consistent

...

The example below shows SRDF/S mode and the devices in ‘Synchronized’

state:

# symrdf list -rdfg 10

Symmetrix ID: 000197700048

Local Device View

-----------------------------------------------------------------------------

STATUS MODES RDF S T A T E S

Sym Sym RDF --------- ----- R1 Inv R2 Inv ----------------------

Dev RDev Typ:G SA RA LNK MDATE Tracks Tracks Dev RDev Pair

----- ----- -------- --------- ----- -------- -------- --- ---- -------------

00067 00178 R1:10 RW RW RW S..1. 0 0 RW WD Synchronized

00068 00179 R1:10 RW RW RW S..1. 0 0 RW WD Synchronized

00069 0017A R1:10 RW RW RW S..1. 0 0 RW WD Synchronized

0006A 0017B R1:10 RW RW RW S..1. 0 0 RW WD Synchronized

...

Another way to view the replication state is using the ‘symrdf query’ command,

as shown below (SRDF/A example):

# symrdf -sid 048 -rdfg 10 -sg database_sg query

Storage Group (SG) Name : database_sg

Symmetrix ID : 000197700048 (Microcode Version: 5977)

Remote Symmetrix ID : 000197700047 (Microcode Version: 5977)

RDF (RA) Group Number : 10 (09)

Source (R1) View Target (R2) View MODE

--------------------------------- ------------------------ ---- ------------

ST LI ST

Standard A N A

Logical Sym T R1 Inv R2 Inv K Sym T R1 Inv R2 Inv RDF Pair

Device Dev E Tracks Tracks S Dev E Tracks Tracks MACE STATE

--------------------------------- -- ------------------------ ---- ------------

N/A 00067 RW 0 201010 RW 00178 WD 0 0 A.X. SyncInProg

N/A 00068 RW 0 202016 RW 00179 WD 0 0 A.X. SyncInProg

N/A 00069 RW 0 204273 RW 0017A WD 0 0 A.X. SyncInProg

...

Total ------- ------- ------- -------

Track(s) 0 3239163 0 0

MB(s) 0 404895 0 0


Legend for MODE:

(M)ode of Operation : A = Async, S = Sync, E = Semi-sync, C = Adaptive Copy

: M = Mixed, T = Active

(A)daptive Copy : D = Disk Mode, W = WP Mode, . = ACp off

(C)onsistency State : X = Enabled, . = Disabled, M = Mixed, - = N/A

Consistency (E)xempt: X = Enabled, . = Disabled, M = Mixed, - = N/A

6. Optional: add +FRA to the replications.

As discussed earlier, you can add archive logs and/or flashback logs to the replications. If flashback logs are stored in the +FRA, use a CG that contains both the fra_sg and database_sg devices. If only archive logs are stored in the +FRA, it is sufficient to have them replicated in their own SRDF group, though they should use the same replication mode as the database_sg (sync or async).

To create a new SRDF group for fra_sg, follow a similar procedure to the one we

followed for database_sg. In the example below we used SRDF group number 11.

# symrdf addgrp -label ora_fra -rdfg 11 -sid 048 -dir 1H:8,2H:8 -remote_sid 047 -remote_dir 1H:8,2H:8 -remote_rdfg 11

# symrdf -sid 048 -sg fra_sg -rdfg 11 createpair -type R1 -remote_sg fra_sg -establish

# symrdf -sid 048 list -rdfg 11 -i 30

Wait until source and target devices have little to no difference, then:

# symrdf -sid 048 -rdfg 11 -sg fra_sg set mode async

# symrdf -sid 048 -rdfg 11 -sg fra_sg enable


Failover operations to the remote site

The first step in this use case is to change the target devices state from write-disabled

(WD), to read-writable (RW). There are different ways to achieve this, based on the

circumstances. For example, if the local site is still functional and SRDF replications are

intact, that would require a different process from a situation in which the local site is not

reachable. To accommodate both cases, we will execute the storage management

commands in this use case from the remote site. It is assumed that the remote array

SG names are identical to the local array SG names; therefore we will continue to use

database_sg and fra_sg on either array.

The next step is to bring up the remote ASM disk groups and database associated with

the R2 (remote) devices. If the R2 devices are already visible to the remote database

server(s) then this operation can take place immediately. However, if the remote database

server(s) are using the remote database snapshots, first change the remote servers to

point to the R2 devices directly.

Finally, either resume replications in the opposite direction, or once the local array is

available again, failback to the local site and resume replications again from there to the

remote site.

This section describes two scenarios: in the first scenario, both sites are reachable and

SRDF replications did not stop. In the second scenario, only the remote site is available

and SRDF stopped replicating.

Changing the target devices' state to read-writable (RW)

If the local site is reachable and SRDF replications didn't stop

If SRDF replications are still on-going, to make the target devices RW, ‘split’ SRDF to stop

the replications. If the production database is still running on the local array, consider

whether it should be brought down first (for example, if production operations are moved

to the remote site, then production should be shut down first on the local array). However,

if the production database remains operational on the local site, and if the business wants

to access the R2 devices directly, split SRDF without bringing down the local production

database.

Stopping SRDF replications using SRDF split

NOTE: If SRDF consistency was enabled, a split command will require a ‘-force’ flag.

1. Issue this command to stop SRDF replications using SRDF split:

# symrdf -sid 048 -rdfg 10 -sg database_sg split -force

2. If ASM +FRA disk groups were replicated in a different SRDF group, split that

group as well.

# symrdf -sid 048 -rdfg 11 -sg fra_sg split -force

At this point, the remote SRDF devices are in the RW state.

If the local site is NOT reachable and SRDF replications have been dropped

If the local site is not reachable (a true disaster), we would focus on the remote site alone,

making the devices available to Oracle. If SRDF/A was configured with the Transmit Idle


setting (default), then the SRDF state will show as ‘TransIdle’, which means SRDF is

waiting for the last cycle to arrive from the R1 devices. If SRDF/S was configured (or if

SRDF/A mode was used but Transmit Idle was disabled), then the SRDF state will show

as ‘Partitioned’. Each case is described in the following sections.

SRDF state is ‘TransIdle’

If the SRDF state is ‘TransIdle’ (Transmit Idle), it means that SRDF was in async mode

and the R2 is waiting for the last cycle to complete. However, since the R1 is no longer

reachable (true disaster), SRDF will stay in this state waiting. In this case, perform a

‘failover -immediate’ operation on the R2 to make the devices RW immediately.

# symrdf -sid 47 -rdfg 10 -sg database_sg query

Storage Group (SG) Name : database_sg

Symmetrix ID : 000197700047 (Microcode Version: 5977)

Remote Symmetrix ID : N/A (Microcode Version: N/A)

RDF (RA) Group Number : 10 (09)

Target (R2) View Source (R1) View MODE

--------------------------------- ------------------------ ---- ------------

ST LI ST

Standard A N A

Logical Sym T R1 Inv R2 Inv K Sym T R1 Inv R2 Inv RDF Pair

Device Dev E Tracks Tracks S Dev E Tracks Tracks MACE STATE

--------------------------------- -- ------------------------ ---- ------------

N/A 00178 WD 0 0 RW 00067 NA NA NA A... TransIdle

N/A 00179 WD 0 0 RW 00068 NA NA NA A... TransIdle

N/A 0017A WD 0 0 RW 00069 NA NA NA A... TransIdle

...

# symrdf -sid 47 -rdfg 10 -sg database_sg failover -immediate

# symrdf -sid 47 -rdfg 10 -sg database_sg query

Storage Group (SG) Name : database_sg

Symmetrix ID : 000197700047 (Microcode Version: 5977)

Remote Symmetrix ID : N/A (Microcode Version: N/A)

RDF (RA) Group Number : 10 (09)

Target (R2) View Source (R1) View MODE

--------------------------------- ------------------------ ---- ------------

ST LI ST

Standard A N A

Logical Sym T R1 Inv R2 Inv K Sym T R1 Inv R2 Inv RDF Pair

Device Dev E Tracks Tracks S Dev E Tracks Tracks MACE STATE

--------------------------------- -- ------------------------ ---- ------------

N/A 00178 RW 0 0 NR 00067 NA NA NA S... Partitioned


N/A 00179 RW 0 0 NR 00068 NA NA NA S... Partitioned

N/A 0017A RW 0 0 NR 00069 NA NA NA S... Partitioned

...

If fra_sg was replicated in its own SRDF group, repeat the steps for fra_sg.

# symrdf -sid 47 -rdfg 11 -sg fra_sg failover -immediate

# symrdf -sid 47 -rdfg 11 -sg fra_sg query

At the end of this step, the remote SRDF devices are in RW state.

SRDF state is ‘Partitioned’

If the SRDF state is ‘Partitioned’ and the R2 devices are still write disabled (WD), then

they cannot be used by Oracle yet. Perform a failover operation on the R2 devices to

make them read-writable (RW).

# symrdf -sid 47 -rdfg 10 -sg database_sg query

Storage Group (SG) Name : database_sg

Symmetrix ID : 000197700047 (Microcode Version: 5977)

Remote Symmetrix ID : N/A (Microcode Version: N/A)

RDF (RA) Group Number : 10 (09)

Target (R2) View Source (R1) View MODE

--------------------------------- ------------------------ ---- ------------

ST LI ST

Standard A N A

Logical Sym T R1 Inv R2 Inv K Sym T R1 Inv R2 Inv RDF Pair

Device Dev E Tracks Tracks S Dev E Tracks Tracks MACE STATE

--------------------------------- -- ------------------------ ---- ------------

N/A 00178 WD 0 0 NR 00067 NA NA NA S... Partitioned

N/A 00179 WD 0 0 NR 00068 NA NA NA S... Partitioned

N/A 0017A WD 0 0 NR 00069 NA NA NA S... Partitioned

...

# symrdf -sid 47 -rdfg 10 -sg database_sg failover

An RDF 'Failover' operation execution is in progress for storage group 'database_sg'. Please wait...

Read/Write Enable device(s) in (0047,010) on RA at target (R2)...Done.

The RDF 'Failover' operation successfully executed for storage group 'database_sg'.

# symrdf -sid 47 -rdfg 10 -sg database_sg query

Storage Group (SG) Name : database_sg

Symmetrix ID : 000197700047 (Microcode Version: 5977)


Remote Symmetrix ID : N/A (Microcode Version: N/A)

RDF (RA) Group Number : 10 (09)

Target (R2) View Source (R1) View MODE

--------------------------------- ------------------------ ---- ------------

ST LI ST

Standard A N A

Logical Sym T R1 Inv R2 Inv K Sym T R1 Inv R2 Inv RDF Pair

Device Dev E Tracks Tracks S Dev E Tracks Tracks MACE STATE

--------------------------------- -- ------------------------ ---- ------------

N/A 00178 RW 0 0 NR 00067 NA NA NA S... Partitioned

N/A 00179 RW 0 0 NR 00068 NA NA NA S... Partitioned

N/A 0017A RW 0 0 NR 00069 NA NA NA S... Partitioned

...

If fra_sg was replicated in its own SRDF group repeat the steps for fra_sg.

# symrdf -sid 47 -rdfg 11 -sg fra_sg failover

# symrdf -sid 47 -rdfg 11 -sg fra_sg query

At the end of this step, the remote SRDF devices are in RW state.

Making sure the R2 devices are visible to the remote servers

Depending on how the business uses the remote servers during normal operations, they

may be accessing the remote snapshots and not the R2 devices. In that case, shut down

the database running from the remote snapshots and change the servers masking view to

point to the R2 devices instead. This section shows how to change the remote server

masking views from the remote snapshot target devices (database_mount_sg) to point

directly to the R2 devices (database_sg).

# symaccess -sid 47 list view

Symmetrix ID : 000197700047

Masking View Name Initiator Group Port Group Storage Group

------------------- ------------------- ------------------- -------------------

database_mount_mv rac_ig 047_pg database_mount_sg

fra_mount_mv rac_ig 047_pg fra_mount_sg

...

# symaccess -sid 47 delete view -name database_mount_mv

# symaccess -sid 47 create view -name database_mv -ig rac_ig -pg 047_pg -sg database_sg

# symaccess -sid 47 delete view -name fra_mount_mv

# symaccess -sid 47 create view -name fra_mv -ig rac_ig -pg 047_pg -sg fra_sg

# symaccess -sid 47 list view


Symmetrix ID : 000197700047

Masking View Name Initiator Group Port Group Storage Group

------------------- ------------------- ------------------- -------------------

database_mv rac_ig 047_pg database_sg

fra_mv rac_ig 047_pg fra_sg

...

Resuming SRDF replications

There are two options to resume replications. The first option is to fail back to the local site as soon as it becomes available again and resume replications from there. The other option is to resume operations at the remote site, switching the replication direction so that SRDF replicates from the remote site to the local site. These two scenarios are described below.

The initial state for either scenario is that both remote and local arrays are available and

connected by SRDF. In that case the SRDF state will change from ‘Partitioned’ to ‘Split’.

In this example, the R2 devices are in the RW state and the database is running at the

remote site.

# symrdf -sid 47 -rdfg 10 -sg database_sg query

Storage Group (SG) Name : database_sg

Symmetrix ID : 000197700047 (Microcode Version: 5977)

Remote Symmetrix ID : 000197700048 (Microcode Version: 5977)

RDF (RA) Group Number : 10 (09)

Target (R2) View Source (R1) View MODE

--------------------------------- ------------------------ ---- ------------

ST LI ST

Standard A N A

Logical Sym T R1 Inv R2 Inv K Sym T R1 Inv R2 Inv RDF Pair

Device Dev E Tracks Tracks S Dev E Tracks Tracks MACE STATE

--------------------------------- -- ------------------------ ---- ------------

N/A 00178 RW 0 0 NR 00067 RW 0 24832 A.X. Split

N/A 00179 RW 0 0 NR 00068 RW 0 24915 A.X. Split

N/A 0017A RW 0 0 NR 00069 RW 0 24727 A.X. Split

...

Switching SRDF replications to replicate from the remote to the local site

1. Swap personality, making the remote site devices R1, and the local site devices

R2.

# symrdf -sid 47 -rdfg 10 -sg database_sg swap

# symrdf -sid 47 -rdfg 10 -sg database_sg query

Storage Group (SG) Name : database_sg

Symmetrix ID : 000197700047 (Microcode Version: 5977)


Remote Symmetrix ID : 000197700048 (Microcode Version: 5977)

RDF (RA) Group Number : 10 (09)

Source (R1) View Target (R2) View MODE

--------------------------------- ------------------------ ---- ------------

ST LI ST

Standard A N A

Logical Sym T R1 Inv R2 Inv K Sym T R1 Inv R2 Inv RDF Pair

Device Dev E Tracks Tracks S Dev E Tracks Tracks MACE STATE

--------------------------------- -- ------------------------ ---- ------------

N/A 00178 RW 0 0 NR 00067 RW 0 0 A... Split

N/A 00179 RW 0 0 NR 00068 RW 0 0 A... Split

N/A 0017A RW 0 0 NR 00069 RW 0 0 A... Split

...

NOTE: After the ‘swap’, the remote array (047) will be shown as the ‘Source (R1)’ instead of the

‘Target (R2)’.

2. Resume replications from the remote to the local site. If time has passed and many changes have accumulated on the remote array, use adaptive copy mode to do a batch update. Once the data differences between the R1 and R2 devices are small enough (either completely synchronized, or there is only a small delta that remains stable between batch updates), change the mode to sync or async.

# symrdf -sid 47 -rdfg 10 -sg database_sg set mode acp_disk

An RDF Set 'ACp Disk Mode ON' operation execution is in progress for storage group 'database_sg'. Please wait...

The RDF Set 'ACp Disk Mode ON' operation successfully executed for storage group 'database_sg'.

# symrdf -sid 47 -rdfg 10 -sg database_sg establish

# symrdf -sid 47 -rdfg 10 -sg database_sg query

Storage Group (SG) Name : database_sg

Symmetrix ID : 000197700047 (Microcode Version: 5977)

Remote Symmetrix ID : 000197700048 (Microcode Version: 5977)

RDF (RA) Group Number : 10 (09)

Source (R1) View Target (R2) View MODE

--------------------------------- ------------------------ ---- ------------

ST LI ST

Standard A N A

Logical Sym T R1 Inv R2 Inv K Sym T R1 Inv R2 Inv RDF Pair

Device Dev E Tracks Tracks S Dev E Tracks Tracks MACE STATE

--------------------------------- -- ------------------------ ---- ------------

N/A 00178 RW 0 0 RW 00067 WD 0 0 CD.. Synchronized


N/A 00179 RW 0 0 RW 00068 WD 0 0 CD.. Synchronized

N/A 0017A RW 0 0 RW 00069 WD 0 0 CD.. Synchronized

...

Legend for MODE:

(M)ode of Operation : A = Async, S = Sync, E = Semi-sync, C = Adaptive Copy

: M = Mixed, T = Active

(A)daptive Copy : D = Disk Mode, W = WP Mode, . = ACp off

(C)onsistency State : X = Enabled, . = Disabled, M = Mixed, - = N/A

Consistency (E)xempt: X = Enabled, . = Disabled, M = Mixed, - = N/A

# symrdf -sid 47 -rdfg 10 -sg database_sg set mode async

An RDF Set 'Asynchronous Mode' operation execution is in progress for storage group 'database_sg'. Please wait...

The RDF Set 'Asynchronous Mode' operation successfully executed for storage group 'database_sg'.

# symrdf -sid 47 -rdfg 10 -sg database_sg query

Storage Group (SG) Name : database_sg

Symmetrix ID : 000197700047 (Microcode Version: 5977)

Remote Symmetrix ID : 000197700048 (Microcode Version: 5977)

RDF (RA) Group Number : 10 (09)

Source (R1) View Target (R2) View MODE

--------------------------------- ------------------------ ---- ------------

ST LI ST

Standard A N A

Logical Sym T R1 Inv R2 Inv K Sym T R1 Inv R2 Inv RDF Pair

Device Dev E Tracks Tracks S Dev E Tracks Tracks MACE STATE

--------------------------------- -- ------------------------ ---- ------------

N/A 00178 RW 0 0 RW 00067 WD 0 0 A... Consistent

N/A 00179 RW 0 0 RW 00068 WD 0 0 A... Consistent

N/A 0017A RW 0 0 RW 00069 WD 0 0 A... Consistent

...

NOTE: Repeat the same operation for fra_sg if it was replicated in a different SRDF group.

Remember to enable consistency on both SRDF groups.

Resuming SRDF replications from local site to remote site

NOTE: In the example below, execute SRDF commands from the local site, which is available.

1. Database operations may be running at the remote site. First, update the R1

devices with the R2 data without disturbing remote database operations.


Repeat the ‘update’ command as necessary until the differences between R2 and

R1 devices are sufficiently low.

# symrdf -sid 48 -rdfg 10 -sg database_sg write_disable r1 -force

# symrdf -sid 48 -rdfg 10 -sg database_sg query

Storage Group (SG) Name : database_sg

Symmetrix ID : 000197700048 (Microcode Version: 5977)

Remote Symmetrix ID : 000197700047 (Microcode Version: 5977)

RDF (RA) Group Number : 10 (09)

Source (R1) View Target (R2) View MODE

--------------------------------- ------------------------ ---- ------------

ST LI ST

Standard A N A

Logical Sym T R1 Inv R2 Inv K Sym T R1 Inv R2 Inv RDF Pair

Device Dev E Tracks Tracks S Dev E Tracks Tracks MACE STATE

--------------------------------- -- ------------------------ ---- ------------

N/A 00067 WD 0 0 NR 00178 RW 0 0 A... Failed Over

N/A 00068 WD 0 0 NR 00179 RW 0 0 A... Failed Over

N/A 00069 WD 0 0 NR 0017A RW 0 0 A... Failed Over

...

# symrdf -sid 48 -rdfg 10 -sg database_sg update

2. When ready to proceed with replications from the local site, if Oracle was running at the remote site, bring down the database and dismount the associated ASM disk groups.

Use SRDF ‘failback’ to resume replications again from local to remote site.

# symrdf -sid 48 -rdfg 10 -sg database_sg failback

# symrdf -sid 48 -rdfg 10 -sg database_sg query

Storage Group (SG) Name : database_sg

Symmetrix ID : 000197700048 (Microcode Version: 5977)

Remote Symmetrix ID : 000197700047 (Microcode Version: 5977)

RDF (RA) Group Number : 10 (09)

Source (R1) View Target (R2) View MODE

--------------------------------- ------------------------ ---- ------------

ST LI ST

Standard A N A

Logical Sym T R1 Inv R2 Inv K Sym T R1 Inv R2 Inv RDF Pair

Device Dev E Tracks Tracks S Dev E Tracks Tracks MACE STATE

--------------------------------- -- ------------------------ ---- ------------

N/A 00067 RW 0 0 RW 00178 WD 0 0 A... Consistent


N/A 00068 RW 0 0 RW 00179 WD 0 0 A... Consistent

N/A 00069 RW 0 0 RW 0017A WD 0 0 A... Consistent

N/A 0006A RW 0 0 RW 0017B WD 0 0 A... Consistent

...

NOTE: After the ‘failback the local array (048) will show itself as the ‘Source (R1)’ again.

NOTE: Repeat the same operations for fra_sg if it was replicated in a different SRDF group.

Creating remote restartable database snapshots

Since SRDF is natively a restartable solution (meaning that the replicated data already

includes all the data, log, and control files), simply create a remote snapshot of the SRDF

remote database devices. Assuming the same SG names are used on the local and

remote arrays, create a remote restartable snapshot of database_sg.

If fra_sg is required at the remote site (that is, if the remote database copies require the

use of archive logs) then make fra_sg part of the replication so it can be snapped as well.

The SRDF remote device state should be either ‘Synchronized’ (for SRDF/S), or

‘Consistent’ (for SRDF/A) before taking the snapshot.

The steps to create a remote restartable copy of the database are identical to creating

such a copy from the local array. The only difference is that the snapshot is created from

the remote array, and linked to target devices in the remote array. The database copy is

accessed by the remote servers.

While Solutions Enabler SnapVX commands to the remote array can be executed from

either the local or remote storage management hosts, it is simpler to execute them from

the remote storage management host to avoid any confusion with the local array.

Creating a remote database snapshot

This example shows how to create a remote restartable snapshot using CLI.

1. To demonstrate what data is preserved in the different scenarios, use a test table

in the production database on the local array.

SQL> create table testTbl (Id int, Step varchar(255));

2. To simulate on-going database activity, start the SLOB OLTP workload in the

background.

3. Insert a known record into the test table before taking the snapshot.

SQL> insert into testTbl values (1, 'Before snapshots taken');

SQL> commit;

4. Create a remote restartable snapshot.

Optionally, verify that the devices are in the ‘synchronized’ state for SRDF/S

replications, or in the ‘consistent’ state for SRDF/A replications.

# symrdf -sid 047 -rdfg 10 -sg database_sg verify -consistent


All devices in the group 'database_sg' are in 'Consistent' state.

Create a remote restartable snapshot.

# symsnapvx -sid 047 -sg database_sg -name database_snap establish

NOTE: The ‘-sid’ option is used in the example to make sure the snapshot is created at the 047

[remote] array. If the commands are executed from the local array (048) storage management

host then ‘-remote’ is needed in the command syntax.

5. Insert another known record after the first snapshot.

SQL> insert into testTbl values (2, 'After first snapshot taken');

SQL> commit;

6. Optionally, create another snapshot.

NOTE: When the same SG and snapshot name are used to create additional snapshots, a new

snapshot generation is created, where generation 0 always points to the latest snapshot. When

snapshots are listed, the date/time information of each generation is shown.

# symsnapvx -sid 047 -sg database_sg -name database_snap establish

7. Insert the last known record for this test.

SQL> insert into testTbl values (3, 'After second snapshot taken');

8. To inspect the snapshots created, use the appropriate level of detail.

symsnapvx -sid 047 -sg database_sg -snapshot_name database_snap list

symsnapvx -sid 047 -sg database_sg -snapshot_name database_snap list -gb -detail

symsnapvx -sid 047 -sg database_sg -snapshot_name database_snap list -gb -summary

Mounting a remote restartable snapshot

Preparations

To access the remote snapshot, assign target devices matching in size to the R2 devices (which are the snapshot source), just as we did earlier in the Mounting restartable snapshot use case for the local array. For ease of management, name these remote SGs the same as on the local array: the name will be database_mount_sg, which is a parent SG to data_mount_sg and redo_mount_sg. If a remote fra_sg snapshot was created then we should also create fra_mount_sg at the remote array.

Create a masking view so the remote servers can access the snapshot target devices

(database_mount_sg, and fra_mount_sg if fra_sg was replicated).

Remember that +FRA is not required for a restartable solution, although you may prefer to

include it so the replicated database can have a place to write archive logs. On the other

hand, if you prefer to open the replica without archive logs, or to have a different +FRA on


the remote mount host, then there is no need to create a snapshot of fra_sg. In the

example below we’ll create the fra_sg snapshot.

Creating remote target devices

This example shows how to create the remote target devices and add them to a masking view:

## Create devices matching in size to the Production devices

# symdev create -v -tdev -cap 100 -captype gb -N 16 # +DATA

STARTING a TDEV Create Device operation on Symm 000197700047.

The TDEV Create Device operation SUCCESSFULLY COMPLETED: 16 devices created.

16 TDEVs create requested in request 1 and devices created are 16 [ 0019A:001A9 ]

Create devices operation succeeded.

# symdev create -v -tdev -cap 50 -captype gb -N 8 # +REDO

STARTING a TDEV Create Device operation on Symm 000197700047.

The TDEV Create Device operation SUCCESSFULLY COMPLETED: 8 devices created.

8 TDEVs create requested in request 1 and devices created are 8 [ 001AA:001B1 ]

Create devices operation succeeded.

# symdev create -v -tdev -cap 250 -captype gb -N 1 # +FRA

STARTING a TDEV Create Device operation on Symm 000197700047.

The TDEV Create Device operation SUCCESSFULLY COMPLETED: 1 devices created.

1 TDEVs create requested in request 1 and devices created are 1[ 001B7 ]

Create devices operation succeeded.

## Create SG’s

# symsg create data_mount_sg

# symsg create redo_mount_sg

# symsg create fra_mount_sg

## Populate SG’s

# symsg -sg data_mount_sg addall -devs 19A:1A9

# symsg -sg redo_mount_sg addall -devs 1AA:1B1

# symsg -sg fra_mount_sg add dev 1B7

## Create Parent SG for data and redo

# symsg create database_mount_sg

# symsg -sg database_mount_sg add sg data_mount_sg,redo_mount_sg

## Create masking view for the remote mount hosts

# symaccess -sid 047 create view -name database_mount_mv -sg database_mount_sg -pg 047_pg -ig rac_ig


# symaccess -sid 047 create view -name fra_mount_mv -sg fra_mount_sg -pg 047_pg -ig rac_ig

# symaccess -sid 047 list view

Symmetrix ID : 000197700047

Masking View Name Initiator Group Port Group Storage Group

------------------- ------------------- ------------------- -------------------

database_mount_mv rac_ig 047_pg database_mount_sg

fra_mount_mv rac_ig 047_pg fra_mount_sg

grid_mv rac_ig 047_pg grid_sg

...

As in the local array use case, if using RAC, install Grid Infrastructure ahead of time

locally for the remote servers instead of replicating it from the local array. In the example

above, we can see the grid_mv masking view which was created ahead of time and used

for +GRID ASM disk group during GI installation.

Linking snapshots to target devices using CLI

This example shows how to link a snapshot to target devices using CLI. Afterwards, we'll start the target database and inspect the data.

1. Choose a snapshot generation ID to link. By listing the snapshots with the -detail

flag, each generation and its date/time is shown.

# symsnapvx -sid 047 -sg database_sg -snapshot_name database_snap list -gb -detail

Storage Group (SG) Name : database_sg

SG's Symmetrix ID : 000197700047 (Microcode Version: 5977)

-------------------------------------------------------------------------------------------------------------

Total

Sym Flags Deltas Non-Shared

Dev Snapshot Name Gen FLRG TS Snapshot Timestamp (GBs) (GBs) Expiration Date

----- -------------------------------- ---- ------- ------------------------ ---------- ---------- ----------

00178 database_snap 0 .... .. Tue Nov 21 07:15:52 2017 33.8 3.6 NA

database_snap 1 .... .. Tue Nov 21 07:15:20 2017 34.2 3.9 NA

database_snap 2 .... .. Tue Nov 21 06:48:30 2017 37.6 0.0 NA

database_snap 3 .... .. Tue Nov 21 06:47:46 2017 37.6 0.0 NA

00179 database_snap 0 .... .. Tue Nov 21 07:15:52 2017 33.9 3.5 NA

database_snap 1 .... .. Tue Nov 21 07:15:20 2017 34.3 3.9 NA

database_snap 2 .... .. Tue Nov 21 06:48:30 2017 37.8 0.0 NA

database_snap 3 .... .. Tue Nov 21 06:47:46 2017 37.8 0.0 NA

...

database_snap 3 .... .. Tue Nov 21 06:47:46 2017 0.6 0.0 NA

---------- ----------

2312.3 120.0


Flags:

(F)ailed : X = Failed, . = No Failure

(L)ink : X = Link Exists, . = No Link Exists

(R)estore : X = Restore Active, . = No Restore Active

(G)CM : X = GCM, . = Non-GCM

(T)ype : Z = zDP snapshot, . = normal snapshot

(S)ecured : X = Secured, . = Not Secured

2. Link the snapshot to the target devices using generation 1, which is the first

database snapshot from the previous example.

# symsnapvx -sid 047 -sg database_sg -lnsg database_mount_sg -snapshot_name database_snap link -gen 1

3. If fra_sg snapshot is necessary, repeat for fra_sg.

# symsnapvx -sg fra_sg -snapshot_name fra_snap list -gb -detail

# symsnapvx -sg fra_sg -lnsg fra_mount_sg -snapshot_name fra_snap link -gen 1

4. Make sure the target host is zoned and masked to the target devices. If this is the

first time a snapshot is made visible to the target host, reboot the host or rescan

the SCSI bus online to make sure the devices and their partitions are seen by the

host. Make sure the partitions (if used) or devices (otherwise) receive Oracle

permissions.
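On Linux hosts, a hedged example of an online rescan (rescan-scsi-bus.sh ships with the sg3_utils package; sysfs paths vary by distribution):

# rescan-scsi-bus.sh

# for h in /sys/class/scsi_host/host*; do echo '- - -' > $h/scan; done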

5. Log in to the ASM instance on the target host. Make sure that the ASM disk

groups on the target devices are visible and in the unmounted state, then mount

them.

[oracle@dsib0057 ~]$ TOGRID

[oracle@dsib0057 ~]$ asmcmd mount data

[oracle@dsib0057 ~]$ asmcmd mount redo

[oracle@dsib0057 ~]$ asmcmd mount fra

[oracle@dsib0057 ~]$ asmcmd lsdg

State Type Rebal Sector Logical_Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name

MOUNTED EXTERN N 512 512 4096 4194304 1638400 274296 0 274296 0 N DATA/

MOUNTED EXTERN N 512 512 4096 4194304 256000 220432 0 220432 0 N FRA/

MOUNTED NORMAL N 512 512 4096 4194304 122880 45400 40960 2220 0 Y GRID/

MOUNTED EXTERN N 512 512 4096 4194304 409568 286372 0 286372 0 N REDO/

6. Log in to the database instance on the mount host, and simply start the database.

Do not perform any media recovery. During this step Oracle performs crash or

instance recovery.


[oracle@dsib0057 ~]$ TODB

[oracle@dsib0057 ~]$ sqlplus "/ as sysdba"

SQL*Plus: Release 12.2.0.1.0 Production on Tue Nov 21 09:11:57 2017

Copyright (c) 1982, 2016, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup

ORACLE instance started.

Total System Global Area 1.5334E+10 bytes

Fixed Size 19255200 bytes

Variable Size 4429185120 bytes

Database Buffers 1.0737E+10 bytes

Redo Buffers 148516864 bytes

Database mounted.

Database opened.

Optional: If archive log mode is not necessary (or +FRA is not available) on the

mount host, the following steps show how to disable archiving before opening the

database.

[oracle@dsib0057 ~]$ sqlplus "/ as sysdba"

SQL> startup mount;

SQL> alter database noarchivelog;

SQL> alter database open;

7. Inspect the data in the test table. Since we used generation 1, which was the first

snapshot, the data in the table reflects the records from before that snapshot.

SQL> select * from testTbl;

ID STEP

---------- --------------------------------------------------

1 Before snapshots taken

Refreshing remote restartable snapshot

This use case shows how to refresh a remote restartable snapshot. The process is similar

to performing this operation from the local array.

1. Before linking a different snapshot generation to the target SG, bring down the

database and ASM disk groups on the mount host, because the target devices’

data is about to be refreshed.

NOTE: If the target database is RAC, make sure to shut down all the instances and

dismount the relevant ASM disk groups on all nodes.
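For reference, a minimal sketch of the shutdown sequence on the mount host follows, using the TODB/TOGRID environment aliases and the disk group names from this paper's lab configuration:

[oracle@dsib0057 ~]$ TODB
[oracle@dsib0057 ~]$ sqlplus "/ as sysdba"
SQL> shutdown immediate;
SQL> exit
[oracle@dsib0057 ~]$ TOGRID
[oracle@dsib0057 ~]$ asmcmd umount data
[oracle@dsib0057 ~]$ asmcmd umount redo
[oracle@dsib0057 ~]$ asmcmd umount fra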


2. Choose a snapshot generation ID to link. By listing the snapshots with the -detail

flag, each generation and its date/time is shown.

# symsnapvx -sg database_sg -snapshot_name database_snap list -gb -detail

3. Link the appropriate snapshot to the target devices. This time we use generation 0, which is the second (latest) snapshot taken in the previous example.

NOTE: We use 'relink' in the syntax. There is no need to terminate the previous link first; simply relink using the new generation ID. Specifying '-gen 0' is not required because it is the default.

# symsnapvx -sg database_sg -lnsg database_mount_sg -snapshot_name database_snap relink

4. If the fra_sg snapshot is needed, repeat the process for fra_sg.

# symsnapvx -sg fra_sg -snapshot_name fra_snap list -gb -detail
# symsnapvx -sg fra_sg -lnsg fra_mount_sg -snapshot_name fra_snap relink

5. As before, the target host should already be zoned and masked to the target devices, so no action is needed.

6. Log in to the ASM instance on the target host. The ASM disk groups on the target

devices should be visible, though in unmounted state. Mount them.

[oracle@dsib0057 ~]$ TOGRID

[oracle@dsib0057 ~]$ asmcmd mount data

[oracle@dsib0057 ~]$ asmcmd mount redo

[oracle@dsib0057 ~]$ asmcmd mount fra

[oracle@dsib0057 ~]$ asmcmd lsdg

State    Type    Rebal  Sector  Logical_Sector  Block  AU       Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N      512     512             4096   4194304  1638400   274296   0                274296          0              N             DATA/
MOUNTED  EXTERN  N      512     512             4096   4194304   256000   219044   0                219044          0              N             FRA/
MOUNTED  NORMAL  N      512     512             4096   4194304   122880    45400   40960            2220            0              Y             GRID/
MOUNTED  EXTERN  N      512     512             4096   4194304   409568   286372   0                286372          0              N             REDO/

7. Log in to the database instance on the target host. Start the database. Do not

perform any media recovery. During this step Oracle performs crash recovery.

[oracle@dsib0057 ~]$ TODB

[oracle@dsib0057 ~]$ sqlplus "/ as sysdba"

SQL> startup

...

Database mounted.

Database opened.


8. Inspect the data in the test table. Since we used generation 0, which was the

second snapshot, the data in the table reflects the records from just before that

snapshot.

SQL> select * from testTbl;

ID STEP

---------- ----------------------------------------

1 Before snapshots taken

2 After first snapshot taken


Mounting remote restartable snapshot with a new DBID and file location

Mount a remote restartable snapshot with new ASM disk group locations, a new instance name, and a new DB name and DBID in exactly the same way you would mount a local snapshot, as described in Mounting restartable snapshot with a new DBID and file location. Follow the same procedure on the remote database hosts.
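For reference, the DBID and database name change itself is typically performed with Oracle's DBNEWID utility (nid) while the database is mounted. The following is a minimal sketch; the new database name 'slobtst' is an illustrative assumption:

[oracle@dsib0057 ~]$ sqlplus "/ as sysdba"
SQL> startup mount;
SQL> exit
[oracle@dsib0057 ~]$ nid target=/ dbname=slobtst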

Creating remote recoverable database snapshots

SRDF is natively a restartable solution; that is, the replicated data includes at least all the data, log, and control files. Therefore, when creating remote recoverable database snapshots, be sure to also include the archive logs in the remote replications. As shown earlier, fra_sg can be replicated in its own SRDF group. Assuming the same SG names are used on the local and remote arrays, we will create remote recoverable snapshots of the database.

NOTE: Make sure that the SRDF remote device state is either 'Synchronized' (for SRDF/S) or 'Consistent' (for SRDF/A) before taking the snapshot.

Follow these steps to create a recoverable snapshot using the CLI.

1. To demonstrate what data is preserved in the different scenarios, use a test table with known records inserted before or after specific steps. To simulate user workload during the tests, run the SLOB OLTP benchmark on the source clustered database.

SQL> create table testTbl (Id int, Step varchar(255)) tablespace slob;

SQL> insert into testTbl values (1, 'Before +DATA & +REDO snapshot');

SQL> commit;

2. Only if hot-backup mode is used (databases earlier than 12c), begin hot-backup mode.

SQL> alter database begin backup;

IMPORTANT: If hot-backup mode is used in the production database and the replication is asynchronous (SRDF/A), execute the 'symrdf checkpoint' command before creating the remote database_sg snapshot. This ensures that the begin-backup state is replicated to the remote array before the remote snapshot is created. If hot-backup mode is not used (Oracle 12c and later), 'symrdf checkpoint' is not needed because there is nothing to wait for.

## Only for async replications that use hot-backup mode
# symrdf -sid 48 -rdfg 10 -sg database_sg checkpoint

3. Create a remote database backup snapshot. For a snapshot that is only recoverable, include only the data files (data_sg). For a snapshot that is both recoverable and restartable, use the parent SG (database_sg).

Optionally, verify that the devices are in the 'Synchronized' state for SRDF/S replications, or the 'Consistent' state for SRDF/A replications.

# symrdf -sid 047 -rdfg 10 -sg database_sg verify -consistent


All devices in the group 'database_sg' are in 'Consistent' state.

# symrdf -sid 047 -rdfg 10 -sg database_sg checkpoint

# TIMESTAMP=`ssh <prod_db_host> 'echo $(date +"%Y-%m-%d_%H-%M-%S")'`

# symsnapvx -sg database_sg -name database_${TIMESTAMP} establish

4. Only if hot-backup mode is used (databases earlier than 12c), end hot-backup mode.

SQL> alter database end backup;

5. For demonstration purposes, insert another known record into the production

database after the first remote snapshot.

SQL> insert into testTbl values (2, 'After +DATA & +REDO snapshot');

SQL> commit;

6. On the production host, perform the post-snapshot Oracle operations.

SQL> alter system switch logfile;

SQL> alter system archive log current;

SQL> alter database backup controlfile to '+FRA/CTRLFILE_BKUP' reuse;

7. Perform this step only if RMAN incremental backups are offloaded to the mount host. In that case, the BCT file version must be switched manually on the production host, just as RMAN would have done automatically at the end of a backup performed from the production host.

Make sure BCT is enabled, then switch its version.

SQL> select filename, status, bytes from v$block_change_tracking;

FILENAME                                    STATUS     BYTES
------------------------------------------- ---------- ----------
+DATA/SLOB/CHANGETRACKING/ctf.264.957543507 ENABLED    22085632

SQL> execute dbms_backup_restore.bctswitch();

8. Create the archive logs snapshot, using the ASM +FRA disk group (fra_sg SG). This snapshot includes sufficient archive logs to recover the database so that it can be opened.

IMPORTANT: If SRDF/A is used, the 'symrdf checkpoint' command must be executed before creating the remote +FRA snapshot, regardless of whether hot-backup mode was used. This ensures that the latest archive logs (created after the log switch on production) have arrived at the remote array before the remote snapshot is created.

Optionally, verify that the devices are in the 'Synchronized' state for SRDF/S replications, or the 'Consistent' state for SRDF/A replications.

# symrdf -sid 047 -rdfg 11 -sg fra_sg verify -consistent

All devices in the group 'fra_sg' are in 'Consistent' state.

# symrdf -sid 047 -rdfg 11 -sg fra_sg checkpoint


# TIMESTAMP=`ssh <db_host> 'echo $(date +"%Y%m%d-%H%M%S")'`

# symsnapvx -sg fra_sg -name fra_${TIMESTAMP} establish

9. For demonstration purposes, insert the last known record for this test.

SQL> insert into testTbl values (3, 'After +FRA snapshot');

SQL> commit;

10. To inspect the snapshots that were created, use the appropriate level of detail:

symsnapvx list
symsnapvx -sg <sg_name> list
symsnapvx -sg <sg_name> -snapshot_name <snapshot_name> list -gb -detail
symsnapvx -sg <sg_name> -snapshot_name <snapshot_name> list -gb -summary

Mounting remote recoverable snapshot

The steps to mount a remote recoverable snapshot are identical to mounting a local

recoverable snapshot, where the only difference is that the mount host is at the remote

site. See Mounting recoverable snapshot for details.

RMAN backup offload to a remote mount host

The steps to perform the RMAN backup from a snapshot taken at the remote site are

identical to creating a backup from a snapshot taken at the local site, where the only

difference is that the mount host is at the remote site. See RMAN backup offload to a

mount host for details.
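For reference, a minimal sketch of the offloaded backup, run from the remote mount host once the snapshot database is mounted there, follows. The syntax is generic RMAN; the incremental level and backup destination are illustrative assumptions, and the full offload procedure (including cataloging and BCT handling) is described in the referenced section.

[oracle@dsib0057 ~]$ rman target /
RMAN> backup incremental level 1 database format '/backup/%d_%U.bkp';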

Opening a remote recoverable database on mount host

The steps to open a remote recoverable database on the mount host are identical to doing

so from the local site. The only difference is that the mount host is at the remote site. See

Opening a recoverable database on a mount host for details.
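For reference, a minimal sketch of the recovery syntax used on the remote mount host follows; the snapshot time shown is an illustrative assumption and should match the snapshot being mounted:

SQL> startup mount;
SQL> recover automatic database until cancel using backup controlfile snapshot time '2017-11-21 16:13:25';
SQL> alter database open read only;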

Production restore from a remote recoverable snapshot

In this use case, the production database on the local array is not in a state from which it can be recovered. The recoverable snapshot on the remote array is restored over the SRDF links to the original production devices, overwriting their data with the valid backup image, followed by database media recovery.

The remote SnapVX restore and the SRDF restore work in parallel to return the data to the production database at the local site as fast as possible.

Note that only the data files portion of the snapshot is restored to the production database. The assumption is that the production database +REDO and +FRA are intact. If that is not the case, they can be restored as well, although overwriting the last production redo logs implies limited data loss. To restore only the data, use the child SG ('data_sg'). To restore both data and redo, use the parent SG ('database_sg'). To restore the FRA, use the 'fra_sg' snapshot.

Be sure to read all the steps first. In particular, make sure that the production redo logs are not overwritten by mistake during the snapshot restore.

Follow these steps to restore a local production database from a remote recoverable snapshot.

1. For demonstration purposes we simulate a disaster by deleting the data files:

[oracle@dsib0144 ~]$ TODB

[oracle@dsib0144 ~]$ srvctl stop database -d slob

[oracle@dsib0144 ~]$ TOGRID

[oracle@dsib0144 ~]$ asmcmd

ASMCMD> rm -rf +DATA/SLOB/DATAFILE/*

[oracle@dsib0144 ~]$ TODB

[oracle@dsib0144 ~]$ srvctl start database -d slob

PRCR-1079 : Failed to start resource ora.slob.db

CRS-5017: The resource action "ora.slob.db start" encountered the

following error:

ORA-01157: cannot identify/lock data file 1 - see DBWR trace file

ORA-01110: data file 1: '+DATA/SLOB/DATAFILE/system.257.953030737'

. For details refer to "(:CLSN00107:)" in

"/u01/oracle/diag/crs/dsib0146/crs/trace/crsd_oraagent_oracle.trc".

CRS-2674: Start of 'ora.slob.db' on 'dsib0146' failed

CRS-5017: The resource action "ora.slob.db start" encountered the

following error:

ORA-01157: cannot identify/lock data file 1 - see DBWR trace file

ORA-01110: data file 1: '+DATA/SLOB/DATAFILE/system.257.953030737'

. For details refer to "(:CLSN00107:)" in

"/u01/oracle/diag/crs/dsib0144/crs/trace/crsd_oraagent_oracle.trc".

CRS-2674: Start of 'ora.slob.db' on 'dsib0144' failed

CRS-2632: There are no more servers to try to place resource

'ora.slob.db' on that would satisfy its placement policy

2. Shut down the production database (if it is still running) and dismount the ASM

disk group that will be restored. Other disk groups can stay online. In this case we

restore only +DATA.

NOTE: If the target database is RAC, make sure to shut down the instances and dismount the ASM disk groups on all nodes.

a. On the production host, shut down the Oracle database.

[oracle@dsib0144 ~]$ TODB


[oracle@dsib0144 ~]$ srvctl stop database -d slob

Dismount the +DATA ASM disk group on all nodes.

NOTE: Make sure only +DATA is dismounted and not +REDO or +FRA, assuming they survived

the disaster.

[oracle@dsib0144 ~]$ TOGRID

[oracle@dsib0144 ~]$ asmcmd umount data

[oracle@dsib0144 ~]$ ssh dsib0146

[oracle@dsib0146 ~]$ TOGRID

[oracle@dsib0146 ~]$ asmcmd umount data

b. Stop SRDF replications for database_sg. The '-force' flag is required for SRDF/A. If consistency is enabled, disable it, because we will restore only data_sg, which is part of database_sg. Then change the SRDF mode to Adaptive Copy.

# symrdf -sid 47 -rdfg 10 -sg database_sg split -force

# symrdf -sid 47 -rdfg 10 -sg database_sg disable

# symrdf -sid 47 -rdfg 10 -sg database_sg set mode acp_disk

3. List the remote snapshots (array ID 047) and restore the desired one. Note that we use the 'data_sg' SG for the restore so that we do not overwrite redo_sg of the production database.

NOTE: It is important to start the SnapVX restore first, followed by the SRDF restore, so they can work in parallel. If the SRDF restore is performed first, the SnapVX restore cannot take place until the SRDF restore has finished or been stopped.

# symsnapvx -sid 047 list -sg data_sg

Storage Group (SG) Name : data_sg
SG's Symmetrix ID       : 000197700047 (Microcode Version: 5977)

----------------------------------------------------------------------------
Sym                                    Num  Flags
Dev   Snapshot Name                    Gens FLRG TS Last Snapshot Timestamp
----- -------------------------------- ---- ------- ------------------------
00178 database_2017-11-22_11-27-29        1 .X.. .. Wed Nov 22 11:27:47 2017
      database_2017-11-22_11-26-04        1 .... .. Wed Nov 22 11:26:22 2017
      database_2017-11-21_16-13-25        1 .... .. Tue Nov 21 16:13:42 2017
      database_2017-11-21_16-03-13        1 .... .. Tue Nov 21 16:03:30 2017
      database_2017-11-21_15-14-52        1 .... .. Tue Nov 21 15:15:09 2017
00179 database_2017-11-22_11-27-29        1 .X.. .. Wed Nov 22 11:27:47 2017
      database_2017-11-22_11-26-04        1 .... .. Wed Nov 22 11:26:22 2017
      database_2017-11-21_16-13-25        1 .... .. Tue Nov 21 16:13:42 2017
      database_2017-11-21_16-03-13        1 .... .. Tue Nov 21 16:03:30 2017
      database_2017-11-21_15-14-52        1 .... .. Tue Nov 21 15:15:09 2017

...


# symsnapvx -sid 047 -sg data_sg restore -snapshot_name database_2017-11-22_11-27-29

4. As soon as the remote SnapVX restore operation starts, start the SRDF restore. Make sure to restore only data_sg in both cases.

# symrdf -sid 47 -rdfg 10 -sg data_sg restore

5. The data copy from the remote snapshot and from SRDF takes place in parallel. Use the following commands to track the progress.

# symsnapvx -sid 047 -sg data_sg list -restore -gb -detail -i 60
# symrdf -sid 47 -rdfg 10 -sg data_sg query -i 60

6. Once both the SnapVX and SRDF restores are done, verify completion.

# symsnapvx -sid 047 -sg data_sg -snapshot_name database_2017-11-22_11-27-29 verify -restore

All devices in the group 'data_sg' are in 'Restored' state.

# symrdf -sid 047 -rdfg 10 -sg data_sg verify -synchronized

All devices in the group 'data_sg' are in 'Synchronized' state.

7. Terminate the snapshot restore session and split SRDF again.

# symsnapvx -sid 047 -sg data_sg terminate -restored -snapshot_name database_2017-11-22_11-27-29
# symrdf -sid 047 -rdfg 10 -sg data_sg split -force

8. On the production host, mount the +DATA disk group.

[oracle@dsib0144 ~]$ TOGRID

[oracle@dsib0144 ~]$ sqlplus "/ as sysasm"

SQL> alter diskgroup data mount;

SQL> select name, state from v$asm_diskgroup;

NAME STATE

------------------------------ -----------

DATA MOUNTED

FRA MOUNTED

GRID MOUNTED

REDO MOUNTED

9. Mount the database and perform media recovery. If hot-backup mode was not used when the snapshot was created, use the 'snapshot time' syntax, as in the earlier use case Opening a recoverable database on a mount host; this time, however, the recovery takes place on the production host rather than the mount host.


[oracle@dsib0144 ~]$ TODB
[oracle@dsib0144 ~]$ sqlplus "/ as sysdba"
SQL> startup mount;
SQL> recover automatic database until cancel using backup controlfile snapshot time '2017-11-22 11:27:29';

SQL> alter database open read only;

Database altered.

SQL> select * from testTbl;

ID STEP

---------- ----------------------------------------

1 Before +DATA & +REDO snapshot

2 After +DATA & +REDO snapshot

If the production database redo logs are not available, you can open the database with resetlogs.

SQL> shutdown immediate;

SQL> startup mount;

SQL> alter database open resetlogs;

Database altered.

If the production redo logs are available, apply the latest redo log transactions using

RMAN, as shown in the following example.

[oracle@dsib0144 ~]$ TODB

[oracle@dsib0144 ~]$ rman target /

RMAN> shutdown immediate;

RMAN> startup mount;

RMAN> recover database;

Starting recover at 31-OCT-17

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=283 instance=slob1 device type=DISK

starting media recovery

archived log for thread 1 with sequence 21 is already on disk as file

+FRA/SLOB/ARCHIVELOG/2017_10_31/thread_1_seq_21.955.958833345

archived log for thread 2 with sequence 7 is already on disk as file

+REDO/SLOB/ONLINELOG/group_7.262.953030783

archived log file name=+REDO/SLOB/ONLINELOG/group_7.262.953030783

thread=2 sequence=7


archived log file

name=+FRA/SLOB/ARCHIVELOG/2017_10_31/thread_1_seq_21.955.958833345

thread=1 sequence=21

Finished recover at 31-OCT-17

RMAN> alter database open resetlogs;

Statement processed

RMAN> quit

10. We can now see that the latest transactions are visible.

[oracle@dsib0144 ~]$ sqlplus "/ as sysdba"

SQL> select * from testTbl;

ID STEP

---------- ----------------------------------------

1 Before +DATA & +REDO snapshot

2 After +DATA & +REDO snapshot

3 After +FRA snapshot

11. The database is now available for all operations, and all nodes can be brought online. If the database was opened with resetlogs, immediately create a new recoverable backup image to serve as the new backup base.
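For reference, once SRDF replication has been re-established, a minimal sketch of creating the new backup base follows, reusing the recoverable-snapshot commands shown earlier in this chapter:

# TIMESTAMP=`ssh <prod_db_host> 'echo $(date +"%Y-%m-%d_%H-%M-%S")'`
# symsnapvx -sg database_sg -name database_${TIMESTAMP} establish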


Chapter 7 Summary and Conclusion

This chapter presents the following topics:

Summary


Summary

Business continuity, disaster protection, and continuous availability are important topics for which every mission-critical database environment must have a strategy. By understanding the benefits that VMAX All Flash, SnapVX, and SRDF provide, you can develop a strong strategy not only to protect the primary databases, but also to allow fast and efficient creation of copies for purposes such as testing, development, reporting, data validation, and backup offloading.

While it is impossible to accommodate all scenarios, requirements, and deployment cases, this white paper provides a view of some of the main strategies for data protection and replication, and shows examples of how to implement them.