Symantec Solutions Deployment Guides
High Availability and Performance Oracle Configuration with Flexible Shared Storage in a SAN-Free Environment using Intel SSDs
Author: Carlos Carrero, Technical Product Manager
10th December 2013, Version 7
Table of Contents

Introduction
Setup
  Hardware
  Software
  Architecture
Deployment Steps
InfiniBand and RDMA Setup
  Linux Drivers
  Packages installation
  Configuring RDMA over InfiniBand
  Configure IP addresses
  Enable Max Performance on CPUs
Storage Foundation Cluster File System HA 6.1 Installation
  Packages Deployment
  Configure SFCFSHA
  Verify the configuration
  Fencing Configuration
    Password-less ssh with CP Servers
    Configuration
Intel SSD Configuration
  Drives configuration
  Tuning SSD performance
    Node 1 Setup
    Node 2 Setup
Volumes and File Systems Configuration
  Initialize and rename the internal SSD devices
  Make internal SSD devices available to the cluster
  Create a File System for redo logs
  Create a File System for Data
Oracle Configuration and Tuning
  Installing Oracle binaries in each node
  Instance configuration
  Oracle Disk Manager configuration
  Oracle redo log configuration
  Oracle huge pages configuration
Oracle HA and Fast Failover Configuration
  Oracle agent configuration in VCS
  Remove previous HA configuration for the mount points
  Service Groups and Resources for Oracle HA
    Service Group tpcc_data
    Service Group tpcc_instance
  Fast Failover Setting
Introduction

This document is a step-by-step guide to building a high availability, high performance environment for Oracle databases without the need for SAN storage. For this purpose, Symantec Cluster File System High Availability (SFCFSHA) 6.1 will be used with the new Flexible Storage Sharing (FSS) feature. FSS allows clustering up to 8 nodes without requiring shared storage while providing high performance and full protection to both mission critical data and applications. To provide a high number of transactions per second and accelerate performance, internal Solid State Drives from Intel will be used. FSS within SFCFSHA will be used to mirror the internal data across servers, ensuring two copies of the data (one in each server) at all times.

It is important to note that, while in this document FSS will be used with internal storage only, FSS does not limit the existing capabilities of using either Direct Attached Storage or SAN for larger capacity. FSS within SFCFSHA can create hybrid models by using both internal and shared storage.

This document is not a substitute for the Installation and Administration Guides, which should be consulted for further information. The steps presented here are intended only as a guide for the very specific configuration described below.
Setup
Hardware
2 x Sandy Bridge generation servers. Each of the server nodes contains:
- 3 x Intel SSD DC S3700 (800GB) for data
- 2 x Intel SSD DC S3700 (200GB) for redo logs
- 512 GB Memory
- 40 x CPUs (2.3GHz)
- 1 x Mellanox 56Gb/s NIC card
Software
- Symantec Storage Foundation Cluster File System 6.1 GA
- Oracle 11gR2 single instance
- RedHat Enterprise Linux 6.3
Architecture
The configuration consists of two servers using only Intel SSDs as internal storage. There are two file systems, one for redo logs and another for data files, that are accessible from the two nodes of the cluster. They are based on two clustered volumes that are striped and mirrored across the two servers. Every write will be made in parallel to the two servers, while reads will be served locally. The two servers are interconnected using one InfiniBand 56Gb/s link. Oracle single instance will be used together with the Fast Failover capability of Symantec Cluster Server, so it can be restarted in a few seconds on the other node in case of failure. Three Coordination Point (CP) Servers are used to provide arbitration in case a split-brain occurs.
Figure 1
Deployment Steps

Figure 2 outlines the different steps needed to complete the deployment. The HW setup and RHEL installation will not be covered in this document. For the HW setup, a single cable has been used to connect the two InfiniBand cards. If there are more than two nodes, an InfiniBand switch should be used. The Intel SSD cards have been plugged into the internal server slots.

The RDMA configuration section lists the drivers needed for the specific RedHat version used and includes information about how to configure InfiniBand and RDMA. Once the new IB interfaces appear in the system, SFCFSHA can be installed and configured. The IB link will be used for both heartbeat and IO shipping across nodes. The public interface will be used as a lower priority link.

Once RDMA is configured, some specific tuning, outlined below, is needed in order to get the best performance from the Intel SSDs.

The next step will be to configure the volumes and clustered file systems for both redo logs and data and to make them available across the two server nodes.

The Oracle section will not cover a full Oracle deployment; it will just highlight the specific tuning and settings that have been done for this setup.
Finally, Oracle will be configured within Symantec Cluster Server and Fast Failover capability will be
enabled.
Figure 2
InfiniBand and RDMA Setup

New to Symantec Cluster File System HA 6.1 is Low Latency Transport (LLT) and GAB support for high-speed interconnects using RDMA technology over InfiniBand or Ethernet (RoCE). LLT maintains two channels (RDMA and non-RDMA) for each link. The RDMA channel is mainly used for data transfer, and the non-RDMA one, created over UDP, is used for sending and receiving heartbeats.
As described above, one Mellanox InfiniBand 56Gb/s card is going to be used in each server. For a production environment, two cards in each server are recommended, so that a single point of failure is avoided. Note that SFCFSHA will be using the native Linux drivers to manage the Host Channel Adapter (HCA), so it is important to install the native ones and not the drivers provided by the HW vendor. This will be the first step to be performed.
Linux Drivers
As highlighted, it is important to install and configure the native Linux drivers and not the ones provided
by Mellanox. Symantec does not yet support any external Mellanox OFED packages. Each of the servers
needs to have these packages versions (or higher) installed:
librdmacm-1.0.10-2.el6.x86_64.rpm
librdmacm-devel-1.0.10-2.el6.x86_64.rpm
librdmacm-utils-1.0.10-2.el6.x86_64.rpm
rdma-1.0-9.el6.noarch.rpm
libmthca-1.0.5-7.el6.x86_64.rpm
libmlx4-1.0.1-7.el6.x86_64.rpm
opensm-3.3.5-1.el6.x86_64.rpm
opensm-libs-3.3.5-1.el6.x86_64.rpm
libibumad-1.3.4-1.el6.x86_64.rpm
libibumad-devel-1.3.4-1.el6.x86_64.rpm
ibutils-1.5.4-3.el6.x86_64.rpm
infiniband-diags
perftest-1.2.3-3.el6.x86_64.rpm
libibverbs-1.1.4-2.el6.x86_64.rpm
libibverbs-devel-1.1.4-2.el6.x86_64.rpm
libibverbs-utils-1.1.4-2.el6.x86_64.rpm
The list presented here is specific to the RedHat distribution used during this configuration. For a more generic list, including SUSE packages, please refer to the section “Using LLT over RDMA” of the Installation Guide.
Packages installation
During this configuration, yum is going to be used to deploy the packages needed:
# yum install librdmacm (it was already installed in our RH distribution)
# yum install librdmacm-devel
# yum install librdmacm-utils
# yum install rdma
# yum install libmthca
# yum install libmlx4
# yum install opensm (this also installs opensm-libs)
# yum install libibumad (it was already installed in our RH distribution)
# yum install libibumad-devel
# yum install ibutils
# yum install infiniband-diags
# yum install perftest
# yum install libibverbs (it was already installed in our RH distribution)
# yum install libibverbs-devel (it was already installed in our RH distribution)
# yum install libibverbs-utils
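Since these are standard distribution packages, they can also be installed in a single yum transaction; a minimal sketch, assuming the same repositories are configured on both nodes:

# yum install -y librdmacm librdmacm-devel librdmacm-utils rdma libmthca \
    libmlx4 opensm libibumad libibumad-devel ibutils infiniband-diags \
    perftest libibverbs libibverbs-devel libibverbs-utils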
Configuring RDMA over InfiniBand
The InfiniBand interfaces are not visible by default until InfiniBand drivers are loaded:
# modprobe rdma_cm
# modprobe rdma_ucm
# modprobe mlx4_en
# modprobe mlx4_ib
# modprobe ib_mthca
# modprobe ib_ipoib
# modprobe ib_umad
Drivers loaded:
# lsmod | egrep "ib|rdma|mlx4"
ib_umad 12122 0
ib_ipoib 77230 0
ib_mthca 137429 0
rdma_ucm 13433 0
ib_uverbs 36269 1 rdma_ucm
rdma_cm 35253 1 rdma_ucm
ib_cm 37028 2 ib_ipoib,rdma_cm
iw_cm 8740 1 rdma_cm
ib_sa 22854 4 ib_ipoib,rdma_ucm,rdma_cm,ib_cm
ib_addr 6091 1 rdma_cm
mlx4_ib 55056 0
mlx4_en 70097 0
mlx4_core 177697 2 mlx4_ib,mlx4_en
ipv6 322541 198 ib_ipoib,ib_addr
ib_mad 40544 5 ib_umad,ib_mthca,ib_cm,ib_sa,mlx4_ib
ib_core 74343 11
ib_umad,ib_ipoib,ib_mthca,rdma_ucm,ib_uverbs,rdma_cm,ib_cm,iw_cm,ib_sa,mlx4_ib,ib_
mad
[root@intel-eva2 ~]#
In order to load the drivers at boot time, modify the /etc/rdma/rdma.conf file on the operating system
with the following values:
ONBOOT=yes
RDMA_UCM_LOAD=yes
MTHCA_LOAD=yes
IPOIB_LOAD=yes
SDP_LOAD=yes
MLX4_LOAD=yes
MLX4_EN_LOAD=yes
Enable RDMA service:
# chkconfig --level 235 rdma on
Start OpenSM:
# /etc/init.d/opensm start
Enable the Linux service to start OpenSM automatically after restart:
# chkconfig --level 235 opensm on
Apply all the previous steps to the other node(s) in the cluster.
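To confirm that the other node ended up in the same state, the same checks can be run remotely; a sketch, assuming ssh access to intel-eva2:

# ssh intel-eva2 'lsmod | egrep "ib|rdma|mlx4"'
# ssh intel-eva2 'chkconfig --list rdma'
# ssh intel-eva2 'chkconfig --list opensm'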
Configure IP addresses
Once the drivers are configured in both nodes, it is time to configure IP addresses for the IB interface we
are planning to use.
Verify the IB configuration at each node. First node:
[root@intel-eva1 ~]# ibstat
CA 'mlx4_0'
CA type: MT4099
Number of ports: 1
Firmware version: 2.10.600
Hardware version: 0
Node GUID: 0x0002c90300193bd0
System image GUID: 0x0002c90300193bd3
Port 1:
State: Initializing
Physical state: LinkUp
Rate: 56
Base lid: 0
LMC: 0
SM lid: 0
Capability mask: 0x02514868
Port GUID: 0x0002c90300193bd1
Link layer: InfiniBand
[root@intel-eva1 ~]#
Second node:
[root@intel-eva2 ~]# ibstat
CA 'mlx4_0'
CA type: MT4099
Number of ports: 1
Firmware version: 2.11.1140
Hardware version: 0
Node GUID: 0x0002c90300365bc0
System image GUID: 0x0002c90300365bc3
Port 1:
State: Initializing
Physical state: LinkUp
Rate: 56
Base lid: 0
LMC: 0
SM lid: 0
Capability mask: 0x02514868
Port GUID: 0x0002c90300365bc1
Link layer: InfiniBand
[root@intel-eva2 ~]#
Now ifconfig shows the ib0 interface in our configuration:
ib0 Link encap:InfiniBand HWaddr
80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
The IP addresses for the internal interconnect that will be used in this setup are:
Node 1 (intel-eva1)
Link0: 192.168.27.1
Node 2 (intel-eva2)
Link0: 192.168.27.2
In order to configure an IP address, modify the file /etc/sysconfig/network-scripts/ifcfg-ib0 in
each of the nodes:
Node 1:
DEVICE="ib0"
BOOTPROTO="static"
#DHCP_HOSTNAME="intel-eva1"
HWADDR="80:00:00:48:FE:80:00:00:00:00:00:00:00:02:C9:03:00:19:3B:D1"
NM_CONTROLLED="no"
ONBOOT="yes"
TYPE="InfiniBand"
UUID="66a74acb-21d7-47b3-9b80-d57af0cab53c"
IPADDR=192.168.27.1
NETMASK=255.255.255.0
NETWORK=192.168.27.0
BROADCAST=192.168.27.255
Node 2:
DEVICE="ib0"
BOOTPROTO="static"
DHCP_HOSTNAME="intel-eva2"
HWADDR="80:00:00:48:FE:80:00:00:00:00:00:00:00:02:C9:03:00:36:5B:C1"
NM_CONTROLLED="no"
ONBOOT="yes"
TYPE="InfiniBand"
UUID="84fdd382-fb56-471d-93c6-ed70bb4224aa"
IPADDR=192.168.27.2
NETMASK=255.255.255.0
NETWORK=192.168.27.0
BROADCAST=192.168.27.255
And restart the network service with:
# service network restart
We can see that the IP is now up on the ib0 interface on node 1.
ib0 Link encap:InfiniBand HWaddr
80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
inet addr:192.168.27.1 Bcast:192.168.27.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
Output of ifconfig for node 2:
ib0 Link encap:InfiniBand HWaddr
80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
inet addr:192.168.27.2 Bcast:192.168.27.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
Verify ping is working:
[root@intel-eva1 ~]# ping 192.168.27.2
PING 192.168.27.2 (192.168.27.2) 56(84) bytes of data.
64 bytes from 192.168.27.2: icmp_seq=1 ttl=64 time=0.270 ms
64 bytes from 192.168.27.2: icmp_seq=2 ttl=64 time=0.274 ms
Also check the connection using ibping. On one of the servers run:
[root@intel-eva1]# ibping -S
Note that it will not provide any response. From the other node, run ibping -G using the Port GUID
provided by the ibstat command (see previous output):
[root@intel-eva2]# ibping -G 0x0002c90300193bd1
Pong from intel-eva1.(none) (Lid 2): time 0.506 ms
Pong from intel-eva1.(none) (Lid 2): time 0.549 ms
Notice the high latency shown by the ibping command. It should be under 30us, which indicates the
system may need some tuning.
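Since the perftest package was installed earlier, the raw RDMA latency can also be measured with ib_send_lat; a sketch, starting the server side on one node and pointing the client at its IB address:

On node 1 (server side):
[root@intel-eva1]# ib_send_lat

On node 2 (client side):
[root@intel-eva2]# ib_send_lat 192.168.27.1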
Enable Max Performance on CPUs
In order to get the best performance from the IB or RoCE interconnect, “Max Performance” needs to be enabled on the servers. Depending on your server BIOS, this can be achieved in different ways.
In this specific setup, the steps are:
- Restart the system and enter the BIOS settings.
- Go to BIOS menu > Launch System setup > BIOS settings > System Profile Settings > System Profile > Max performance
On the Intel servers used in this configuration, this is how it looks after enabling “Turbo Boost Technology” (BIOS screenshot not reproduced here).
The CPUs need to run at their maximum clock rate to get the best performance. Initially, that is not the case with the default installation. Below it can be seen that the CPU is rated at 2.40GHz but is running at roughly 1GHz (1064 MHz).
[root@intel-eva1]# cat /proc/cpuinfo | more
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 47
model name : Intel(R) Xeon(R) CPU E7- 4870 @ 2.40GHz
stepping : 2
cpu MHz : 1064.000
cache size : 30720 KB
physical id : 0
In order to fix that, edit the file /boot/grub/grub.conf and add the following to the boot parameters:
intel_idle.max_cstate=0 processor.max_cstate=1
Perform that operation on both servers and reboot them. This is how the grub.conf file looks in this particular setup:
[root@intel-eva1]# cat /boot/grub/grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/sda2
# initrd /initrd-[generic-]version.img
#boot=/dev/sda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Red Hat Enterprise Linux (2.6.32-279.el6.x86_64)
root (hd0,0)
kernel /vmlinuz-2.6.32-279.el6.x86_64 ro root=UUID=20d1e2a7-b36c-447a-
a35b-6c30f636cbc2 rd_NO_LUKS KEYBOARDTYPE=pc KEYTABLE=us LANG=en_US.UTF-8
console=tty0 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto console=tty0
console=ttyS0,19200 rd_NO_LVM rd_NO_DM rhgb quiet intel_idle.max_cstate=0
processor.max_cstate=1
initrd /initramfs-2.6.32-279.el6.x86_64.img
[root@intel-eva1 grub]#
Additionally, make sure that each CPU uses “performance” as its scaling governor, which provides the best performance:
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Power_Management_Guide/cpufreq_setup.html
The servers used in this setup do not have the cpupower package installed, so we have changed the
setting manually. Verify the governors available:
[root@intel-eva1]# cat
/sys/devices/system/cpu/cpu1/cpufreq/scaling_available_governors
ondemand userspace performance
And manually set performance for each cpu:
[root@intel-eva1 cpufreq]# for i in `ls
/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor`
> do
> echo $i
> echo performance > $i
> done
And verify the settings take effect:
[root@intel-eva1 cpufreq]# cat /proc/cpuinfo | grep "cpu MHz"
cpu MHz : 2395.000
cpu MHz : 2395.000
cpu MHz : 2395.000
cpu MHz : 2395.000
cpu MHz : 2395.000
…
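For reference, on systems where the cpupower package is available, the governor can be set for all CPUs with a single command instead of the manual loop above; a minimal sketch:

# cpupower frequency-set -g performance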
Storage Foundation Cluster File System HA 6.1 Installation
Packages Deployment
In order to present some of the new features of the installer, and to provide a complete reference, the steps taken to deploy SFCFSHA 6.1 for this particular configuration are noted here.
As usual, run the installer script.
[root@intel-eva1 rhel6_x86_64]# ./installer
Storage Foundation and High Availability Solutions 6.1 Install Program
Symantec Product Version Installed on intel-eva1 Licensed
================================================================================
Symantec Licensing Utilities (VRTSvlic) are not installed due to which products
and licenses are not discovered.
Use the menu below to continue.
Task Menu:
P) Perform a Pre-Installation Check I) Install a Product
C) Configure an Installed Product G) Upgrade a Product
O) Perform a Post-Installation Check U) Uninstall a Product
L) License a Product S) Start a Product
D) View Product Descriptions X) Stop a Product
R) View Product Requirements ?) Help
Enter a Task: [P,I,C,G,O,U,L,S,D,X,R,?] I
Select to install SFCFSHA (option 5)
Storage Foundation and High Availability Solutions 6.1 Install Program
1) Symantec Dynamic Multi-Pathing (DMP)
2) Symantec Cluster Server (VCS)
3) Symantec Storage Foundation (SF)
4) Symantec Storage Foundation and High Availability (SFHA)
5) Symantec Storage Foundation Cluster File System HA (SFCFSHA)
6) Symantec Storage Foundation for Oracle RAC (SF Oracle RAC)
7) Symantec ApplicationHA (ApplicationHA)
b) Back to previous menu
Select a product to install: [1-7,b,q] 5
Agree to the EULA and select option 3 to install all rpms. All packages are needed because this option includes the Coordination Point Server packages.
Symantec Storage Foundation Cluster File System HA 6.1 Install Program
1) Install minimal required rpms - 492 MB required
2) Install recommended rpms - 769 MB required
3) Install all rpms - 793 MB required
4) Display rpms to be installed for each option
Select the rpms to be installed on all systems? [1-4,q,?] (2) 3
These are the two nodes that compose our cluster:
Enter the 64 bit RHEL6 system names separated by spaces: [q,?] (intel-eva1 intel-eva2)
The installer will verify that all prerequisites are met. If password-less ssh has not been enabled, the installer can set it up automatically; you just need to enter the root password of the other node.
Once all the prerequisites are met, the installer will deploy packages on both nodes at the same time. Here is an example of the first steps:
Symantec Storage Foundation Cluster File System HA 6.1 Install Program
intel-eva1 intel-eva2
Logs are being written to /var/tmp/installer-201307040646nTM while installer is
in progress
Installing SFCFSHA: 13% _________________________________________
Estimated time remaining in total: (mm:ss) 2:25 4 of 30
Performing SFCFSHA preinstall tasks ............................... Done
Installing VRTSperl rpm ........................................... Done
Installing VRTSvlic rpm ........................................... Done
Installing VRTSspt rpm ............................................ Done
Installing VRTSvxvm rpm |
Once the package installation has finished, the installer will ask for a license key. These servers will be managed by Veritas Operations Manager (VOM), so we can enable keyless licensing (option 2).
Symantec Storage Foundation Cluster File System HA 6.1 Install Program
intel-eva1 intel-eva2
To comply with the terms of Symantec's End User License Agreement, you have 60
days to either:
* Enter a valid license key matching the functionality in use on the systems
* Enable keyless licensing and manage the systems with a Management Server. For
more details visit http://go.symantec.com/sfhakeyless. The product is fully
functional during these 60 days.
1) Enter a valid license key
2) Enable keyless licensing and complete system licensing later
How would you like to license the systems? [1-2,q] (2) 2
In this setup neither replication nor Global Cluster Option will be used.
Would you like to enable replication? [y,n,q] (n) n
Would you like to enable the Global Cluster Option? [y,n,q] (n) n
Configure SFCFSHA
Once the system has been registered, the installer will provide the option to configure the cluster. We can postpone this action if needed and run it later using the installer -config option. In this case, we are going to complete the cluster configuration in a single step.
Would you like to configure SFCFSHA on intel-eva1 intel-eva2? [y,n,q] (n) y
This setup will not be using any SAN; therefore, Coordination Point Servers will be used for split-brain protection. Fencing will be configured in a later section, so we will answer no at this point.
Do you want to configure I/O Fencing in enabled mode? [y,n,q,?] (y) n
Enter the cluster name.
Enter the unique cluster name: [q,?] intel-eva
And now the private link needs to be chosen. In previous steps, RDMA over InfiniBand was configured. SFCFSHA supports the use of RDMA for the private link (option 3), and this is the option selected in this setup to reduce latency and increase throughput.
1) Configure heartbeat links using LLT over Ethernet
2) Configure heartbeat links using LLT over UDP
3) Configure heartbeat links using LLT over RDMA
4) Automatically detect configuration for LLT over Ethernet
b) Back to previous menu
How would you like to configure heartbeat links? [1-4,b,q,?] (4) 3
For this deployment InfiniBand will be used, so option 2 will be selected.
1) Converged Ethernet (RoCE)
2) InfiniBand
b) Back to previous menu
Choose the RDMA interconnect type [1-2,b,q,?] (1) 2
The installer will verify that all the IB packages needed are present. If any of those packages are not installed, the installer will indicate the missing ones. Please refer to the previous steps where the InfiniBand and RDMA setup was explained and verify that all the packages needed have been properly installed. Once correct, the installer will detect our ib0 interface.
Checking required OS rpms for LLT over RDMA on intel-eva1 ........................................ Done
Checking required OS rpms for LLT over RDMA on intel-eva2 ........................................ Done
Checking RDMA driver and configuration on intel-eva1 ............................................. Done
Checking RDMA opensm service on intel-eva1 ....................................................... Done
Checking RDMA driver and configuration on intel-eva2 ............................................. Done
Checking RDMA opensm service on intel-eva2 ....................................................... Done
Configuring and starting RDMA drivers on intel-eva1 .............................................. Done
Configuring and starting RDMA drivers on intel-eva2 .............................................. Done
Checking the IP address for the RDMA enabled NICs on intel-eva1 .................................. Done
Checking the IP address for the RDMA enabled NICs on intel-eva2 .................................. Done
More detailed information about the IP address of the RDMA enabled NICs:
System RDMA NIC IP Address
================================================================================
intel-eva1 ib0 192.168.27.1
intel-eva2 ib0 192.168.27.2
Discovering NICs on intel-eva1 ......................................................... Discovered ib0
Enter the NIC for the first private heartbeat link (RDMA) on intel-eva1: [b,q,?] (ib0)
Select ib0 and the IP address previously configured.
Do you want to use address 192.168.27.1 for the first private heartbeat link on
intel-eva1: [y,n,q,b,?] (y) y
Enter the port for the first private heartbeat link (RDMA) on intel-eva1:
[b,q,?] (50000)
Would you like to configure a second private heartbeat link? [y,n,q,b,?] (y) n
The public network will be used as a low-priority interconnect:
Enter the NIC for the low-priority heartbeat link(RDMA or UDP) on intel-eva1: [b,q,?] (eth0)
Input 'y' to go on configuring the RDMA link, input 'n' for the UDP link [y,n,q,b] (y) n
Do you want to use the address 10.182.74.220 for the low-priority heartbeat link on intel-eva1:
[y,n,q,b,?] (y)
Enter the UDP port for the low-priority heartbeat link on intel-eva1: [b,q,?] (50010)
Are you using the same NICs for private heartbeat links on all systems? [y,n,q,b,?] (y)
Do you want to use the address 192.168.27.2 for the first private heartbeat link on intel-eva2:
[y,n,q,b,?] (y)
The RDMA Port for this link: 50000
Do you want to use the address 10.182.74.221 for the low-priority heartbeat link on intel-eva2:
[y,n,q,b,?] (y)
The UDP Port for this link: 50010
Checking media speed for ib0 on intel-eva1 ................................................... 56Gb/sec
Checking media speed for ib0 on intel-eva2 ................................................... 56Gb/sec
Enter a unique cluster ID number between 0-65535: [b,q,?] (28671)
The installer will verify that the cluster ID is not in use on that network:
The cluster cannot be configured if the cluster ID 28671 is in use by another
cluster. Installer can perform a check to determine if the cluster ID is
duplicate. The check will take less than a minute to complete.
Would you like to check if the cluster ID is in use by another cluster? [y,n,q]
(y) y
Checking cluster ID ............................................... Done
Duplicated cluster ID detection passed. The cluster ID 28671 can be used for the cluster.
Press [Enter] to continue:
Verify all the information is correct and let the installer configure the cluster.
Cluster information verification:
Cluster Name: intel-eva
Cluster ID Number: 28671
Private Heartbeat NICs for intel-eva1:
link1=ib0 over RDMA
ip 192.168.27.1 netmask 255.255.255.0 port 50000
Low-Priority Heartbeat NIC for intel-eva1:
link-lowpri1=eth0 over UDP
ip 10.182.74.220 netmask 255.255.240.0 port 50010
Private Heartbeat NICs for intel-eva2:
link1=ib0 over RDMA
ip 192.168.27.2 netmask 255.255.255.0 port 50000
Low-Priority Heartbeat NIC for intel-eva2:
link-lowpri1=eth0 over UDP
ip 10.182.74.221 netmask 255.255.240.0 port 50010
Is this information correct? [y,n,q,?] (y)
A Virtual IP may be added so the cluster can be managed through it; in this setup it will not be used.
For simplicity in this example, secure mode will not be used. Accept the default cluster credentials.
Would you like to configure the VCS cluster in secure mode? [y,n,q,?] (n)
No SMTP
Do you want to configure SMTP notification? [y,n,q,?] (n) n
Processes will be stopped.
Do you want to stop SFCFSHA processes now? [y,n,q,?] (y)
And the configuration will automatically start.
Logs are being written to /var/tmp/installer-201307040646nTM while installer is
in progress
Starting SFCFSHA: 16% __________________________________________
Estimated time remaining in total: (mm:ss) 2:35 4 of 24
Performing SFCFSHA configuration .................................. Done
Starting vxdmp .................................................... Done
Starting vxio ..................................................... Done
Starting vxspec ................................................... Done
Starting vxconfigd
Take note of the log directory in case any troubleshooting during the install process is needed.
Finally, the installer will report a successful installation.
Symantec Storage Foundation Cluster File System HA Startup completed successfully
Verify the configuration
First, verify that the cluster is up and running and that the two nodes are available:
[root@intel-eva1 rhel6_x86_64]# hastatus -sum
-- SYSTEM STATE
-- System State Frozen
A intel-eva1 RUNNING 0
A intel-eva2 RUNNING 0
-- GROUP STATE
-- Group System Probed AutoDisabled State
B cvm intel-eva1 Y N ONLINE
B cvm intel-eva2 Y N ONLINE
[root@intel-eva1 rhel6_x86_64]#
Given that we have used RDMA for our interconnect in this deployment, we are going to verify that it is working correctly.
[root@intel-eva1 ~]# lltstat -l
LLT link information:
link 0 ib0 on rdma hipri
mtu 8192, sap 0xc350, broadcast 192.168.27.255, addrlen 4
txpkts 4657 txbytes 816981
rxpkts 4001 rxbytes 413411
latehb 0 badcksum 0 errors 0
[root@intel-eva1 ~]#
And verify the link is active.
[root@intel-eva1 ~]# lltstat -rnvv active
LLT node information:
Node State Link Status TxRDMA RxRDMA Address
* 0 intel-eva1 OPEN
ib0 UP UP UP 192.168.27.1
1 intel-eva2 OPEN
ib0 UP UP UP 192.168.27.2
[root@intel-eva1 ~]#
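As an additional, optional check, GAB port membership can be listed to confirm that both nodes joined the cluster; a sketch:

[root@intel-eva1 ~]# gabconfig -a
(Ports a and h should show membership for nodes 0 and 1.)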
Fencing Configuration
Fencing is needed in order to protect the cluster from a split-brain situation. If the nodes lose all heartbeat communication, a mechanism is needed to decide whether the other server is alive and which node will continue running the application.

FSS does not need SCSI3 keys, given that the storage is local. That avoids the risks of a shared configuration, where several servers can write to the same device. With FSS, there is only one final writer, which is the node where the storage is attached. It is this node that will be in charge of protecting the data.
In order to decide which servers should be up and running after a split-brain, arbitration via Coordination Point (CP) Servers is needed. For a production environment, 3 CP Servers are required. For this deployment we are going to present an example where 1 CP Server will be used. The same CP Servers can be used for other clusters.

In this particular case, cps2 (VIP 10.182.100.137) is the server that will provide arbitration for the cluster. This guide does not cover how to deploy a CP Server, as that is covered in other guides, and any existing CP Server can be reused.
Password-less ssh with CP Servers
In order to complete the configuration, the cluster node from which the installer with the -fencing option will be executed needs to have password-less ssh configured with the CP Servers.
[root@intel-eva1 ~]# cd /root
[root@intel-eva1 ~]# ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
45:c6:a2:24:b7:fc:e5:9a:1b:ff:c6:d6:a5:ca:be:76 root@intel-eva1
The key's randomart image is:
+--[ DSA 1024]----+
| .o |
| . o .o. |
| = o .. |
| + .. |
| .So |
| . . .|
| .o . . o |
| oo .= E |
| ...**+ |
+-----------------+
[root@intel-eva1 .ssh]# scp id_dsa.pub cps2:/root/id_dsa_eva1.pub
The authenticity of host 'cps2 (10.182.99.207)' can't be established.
RSA key fingerprint is 36:3f:e8:93:bd:3b:e1:85:fa:13:bf:7b:87:26:29:a2.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'cps2,10.182.99.207' (RSA) to the list of known hosts.
root@cps2's password:
id_dsa.pub
100% 605 0.6KB/s 00:00
[root@intel-eva1 .ssh]#
[root@intel-eva1 .ssh]# ssh cps2
root@cps2's password:
[root@cps2 ~]#
[root@cps2 .ssh]# cat /root/id_dsa_eva1.pub >> /root/.ssh/authorized_keys
[root@cps2 .ssh]# exit
logout
Connection to cps2 closed.
And verify that password-less ssh works:
[root@intel-eva1 .ssh]# ssh cps2 uname -a
Linux cps2 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64
x86_64 GNU/Linux
[root@intel-eva1 .ssh]#
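As an alternative to copying the key by hand, the ssh-copy-id utility shipped with OpenSSH performs the same append to authorized_keys; a sketch:

[root@intel-eva1 ~]# ssh-copy-id -i /root/.ssh/id_dsa.pub root@cps2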
Configuration
Call the installer with the -fencing option:
[root@intel-eva1 ~]# cd /opt/VRTS/install
[root@intel-eva1 install]# ./installsfcfsha61 -fencing
Enter the name of one system in the VCS cluster for which you would like to configure I/O fencing:
intel-eva1 intel-eva2
Cluster information verification:
Cluster Name: intel-eva-clus
Cluster ID Number: 60717
Systems: intel-eva1 intel-eva2
Would you like to configure I/O fencing on the cluster? [y,n,q] (y)
For FSS with no shared storage, option 1 will be chosen:
Fencing configuration
1) Configure Coordination Point client based fencing
2) Configure disk based fencing
3) Configure fencing in disabled mode
4) Replace/Add/Remove coordination points
5) Refresh keys/registrations on the existing coordination points
6) Set the order of existing coordination points
Select the fencing mechanism to be configured in this Application Cluster: [1-6,q,?] 1
This is our first deployment, so there are no issues restarting VCS. If fencing is being enabled after some resources have already been created, these will have to be restarted.
This I/O fencing configuration option requires a restart of VCS. Installer will stop VCS at a later stage in this run. Note that the service groups will be online only on the systems that are in the 'AutoStartList' after restarting VCS. Do you want to continue? [y,n,q,b,?] y
We are using local storage and therefore we do not have shared storage with SCSI3 PR.
Non-SCSI3 fencing will be configured
In this environment, either Non-SCSI3 fencing can be configured or fencing can be configured in
disabled mode
Do you want to configure Non-SCSI3 fencing? [y,n,q,b] (y) y
3 CP Servers are required for a production environment. For this setup, only 1 CP Server will be used.
Enter the total number of coordination points. All coordination points should be Coordination Point
servers: [b]
(3) 1
In case your CP Server is connected via several networks, you can add all of them to the configuration, so the cluster nodes will try to reach the CP Servers on different interfaces. For this configuration we have a single network to connect to the CP Server. You will also need to enter the Virtual IP the CP Server is listening on.
Press [Enter] to continue:
You are now going to be asked for the Virtual IP addresses or fully qualified host names of the Coordination Point Servers. Note that the installer assumes these values to be identical as viewed from all the client cluster nodes.
Press [Enter] to continue:
How many IP addresses would you like to use to communicate to Coordination Point Server #1? [b,q,?] (1)
1
Enter the Virtual IP address or fully qualified host name #1 for the HTTPS Coordination Point Server
#1: [b] 10.182.100.137
If the previous steps to configure password-less rsh or ssh were not successful, this message may appear at this point. Make sure that you can ssh without a password from the node where you are running the fencing configuration to the CP Servers:
Enter the Virtual IP address or fully qualified host name #1 for the HTTPS Coordination Point Server
#1: [b] 10.182.100.137
Cannot communicate with system 10.182.100.137. Make sure password-less rsh or ssh is configured or the
CP Server is
up and running.
Enter the Virtual IP address or fully qualified host name #1 for the HTTPS Coordination Point Server
#1: [b]
If password-less ssh was enabled as expected, the configuration can continue. Accept the default port:
Enter the Virtual IP address or fully qualified host name #1 for the HTTPS Coordination Point Server
#1: [b] 10.182.100.137
Enter the port that the coordination point server 10.182.100.137 would be listening on or accept the
default port
suggested: [b] (443)
Review that the configuration is correct:
CPS based fencing configuration: Coordination points verification
Total number of coordination points being used: 1
Coordination Point Server ([VIP or FQHN]:Port):
1. 10.182.100.137 ([10.182.100.137]:443)
Is this information correct? [y,n,q] (y)
CPS based fencing configuration: Client cluster verification
CPS Admin utility : /opt/VRTScps/bin/cpsadm
Cluster ID: 60717
Cluster Name: intel-eva-clus
UUID for the above cluster: {313a17b2-1dd2-11b2-a040-a4616f743004}
Is this information correct? [y,n,q] (y)
Verify that all the registrations have happened:
Updating client cluster information on Coordination Point Server 10.182.100.137
Adding the client cluster to the Coordination Point Server 10.182.100.137 .................. Done
Registering client node intel-eva1 with Coordination Point Server 10.182.100.137 ........... Done
Adding CPClient user for communicating to Coordination Point Server 10.182.100.137 ......... Done
Adding cluster intel-eva-clus to the CPClient user on Coordination Point Server 10.182.100.137 ...
Done
Registering client node intel-eva2 with Coordination Point Server 10.182.100.137 ........... Done
Adding CPClient user for communicating to Coordination Point Server 10.182.100.137 ......... Done
Adding cluster intel-eva-clus to the CPClient user on Coordination Point Server 10.182.100.137 ...
Done
Installer will stop VCS before applying fencing configuration. To make sure VCS shuts down
successfully, unfreeze
any frozen service group and unmount the mounted file systems in the cluster.
Are you ready to stop VCS and apply fencing configuration on all nodes at this time? [y,n,q] (y)
VCS will be restarted and fencing configuration will be applied
Stopping VCS on intel-eva2 ................................................................... Done
Stopping Fencing on intel-eva2 ............................................................... Done
Stopping VCS on intel-eva1 ................................................................... Done
Stopping Fencing on intel-eva1 ............................................................... Done
Updating /etc/vxfenmode file on intel-eva1 ................................................... Done
Updating /etc/vxenviron file on intel-eva1 ................................................... Done
Updating /etc/sysconfig/vxfen file on intel-eva1 ............................................. Done
Updating /etc/llttab file on intel-eva1 ...................................................... Done
Updating /etc/vxfenmode file on intel-eva2 ................................................... Done
Updating /etc/vxenviron file on intel-eva2 ................................................... Done
Updating /etc/sysconfig/vxfen file on intel-eva2 ............................................. Done
Updating /etc/llttab file on intel-eva2 ...................................................... Done
Starting Fencing on intel-eva1 ............................................................... Done
Starting Fencing on intel-eva2 ............................................................... Done
Updating main.cf with fencing ................................................................ Done
Starting VCS on intel-eva1 ................................................................... Done
Starting VCS on intel-eva2 ................................................................... Done
The Coordination Point Agent monitors the registrations on the coordination points.
Do you want to configure Coordination Point Agent on the client cluster? [y,n,q] (y)
It is recommended to configure the Coordination Point Agent on the cluster. The goal is to proactively detect any anomaly with the CP Servers:
Do you want to configure Coordination Point Agent on the client cluster? [y,n,q] (y) y
Enter a non-existing name for the service group for Coordination Point Agent: [b] (vxfen)
Adding Coordination Point Agent via intel-eva1 .............................................. Done
I/O Fencing configuration ................................................................... Done
I/O Fencing configuration completed successfully
And finally, you can verify that fencing has been enabled:
[root@intel-eva1 install]# vxfenadm -d
I/O Fencing Cluster Information:
================================
Fencing Protocol Version: 201
Fencing Mode: Customized
Fencing Mechanism: cps
Cluster Members:
* 0 (intel-eva1)
1 (intel-eva2)
RFSM State Information:
node 0 in state 8 (running)
node 1 in state 8 (running)
[root@intel-eva1 install]#
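The registrations can also be inspected from the CP Server side with the cpsadm utility; a sketch, assuming the CP Server VIP used above:

[root@intel-eva1 install]# cpsadm -s 10.182.100.137 -a list_nodes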
Intel SSD Configuration
Drives configuration
This section explains the steps taken to configure the five Intel SSDs used in each node.
The RAID controller used is:
07:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108
[Liberator] (rev 03)
The RAID controller is managed by the MegaCLI utility, which can be downloaded from the LSI support page:
http://www.lsi.com/support/Pages/download-results.aspx?keyword=MegaCli
Install the latest MegaCLI package: MegaCli-8.07.10-1.noarch.rpm
[root@intel-eva2 Linux MegaCLI 8.07.10]# rpm -ivh MegaCli-8.07.10-1.noarch.rpm
Preparing... ########################################### [100%]
1:MegaCli ########################################### [100%]
[root@intel-eva2 Linux MegaCLI 8.07.10]#
If the SSDs are not visible after reboot, enter Control + G during boot to enter the controller
configuration and make the SSDs accessible.
Verify that the Operating System is able to detect them:
[root@intel-eva1]# dmesg | grep SSD
scsi 0:0:0:0: Direct-Access ATA INTEL SSDSC2BA80 0265 PQ: 0 ANSI: 5
scsi 0:0:2:0: Direct-Access ATA INTEL SSDSC2BA80 0265 PQ: 0 ANSI: 5
scsi 0:0:3:0: Direct-Access ATA INTEL SSDSC2BA80 0265 PQ: 0 ANSI: 5
scsi 0:0:4:0: Direct-Access ATA INTEL SSDSC2BA20 0265 PQ: 0 ANSI: 5
scsi 0:0:5:0: Direct-Access ATA INTEL SSDSC2BA20 0265 PQ: 0 ANSI: 5
scsi 0:0:6:0: Direct-Access ATA INTEL SSDSC2BA80 0265 PQ: 0 ANSI: 5
Using the MegaCli64 utility, the slot number, capacity, ID, etc. can be verified. Here are some of the fields for the local drives:
[root@intel-eva1]# /opt/MegaRAID/MegaCli/MegaCli64 -PDlist -a0 | egrep "Device
ID|Slot Number|WWN|PD Type|Raw Size"
Enclosure Device ID: 252
Slot Number: 0
WWN: 5000C5000CB3B3A0
PD Type: SAS
Raw Size: 136.732 GB [0x11177330 Sectors]
Enclosure Device ID: 252
Slot Number: 1
WWN: 50015178f361d7dd
PD Type: SATA
Raw Size: 745.211 GB [0x5d26ceb0 Sectors]
Enclosure Device ID: 252
Slot Number: 2
WWN: 50015178f361d5fd
PD Type: SATA
Raw Size: 745.211 GB [0x5d26ceb0 Sectors]
Enclosure Device ID: 252
Slot Number: 3
WWN: 50015178f361d807
PD Type: SATA
Raw Size: 745.211 GB [0x5d26ceb0 Sectors]
Enclosure Device ID: 252
Slot Number: 4
WWN: 50015178f361d642
PD Type: SATA
Raw Size: 745.211 GB [0x5d26ceb0 Sectors]
Enclosure Device ID: 252
Slot Number: 5
WWN: 50015178f355f328
PD Type: SATA
Raw Size: 186.310 GB [0x1749f1b0 Sectors]
Enclosure Device ID: 252
Slot Number: 6
WWN: 5000C5000CB3BAE8
PD Type: SAS
Raw Size: 136.732 GB [0x11177330 Sectors]
Enclosure Device ID: 252
Slot Number: 7
WWN: 50015178f355f0ce
PD Type: SATA
Raw Size: 186.310 GB [0x1749f1b0 Sectors]
If the devices are not accessible to the OS, MegaCli64 can be used to add them. This is one example:
[root@intel-eva1]# ./MegaCli64 -CfgLdAdd -r0 [252:1] -a0
Adapter 0: Created VD 1
Adapter 0: Configured the Adapter!!
Exit Code: 0x00
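The newly created virtual drive can be checked with the same utility; a sketch that lists all logical drives on adapter 0:

[root@intel-eva1]# ./MegaCli64 -LDInfo -Lall -a0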
Verify the OS can see them:
[root@intel-eva1 ~]# lsscsi | grep INTEL
[0:2:0:0] disk INTEL RS2BL080 2.12 /dev/sda
[0:2:1:0] disk INTEL RS2BL080 2.12 /dev/sdb
[0:2:2:0] disk INTEL RS2BL080 2.12 /dev/sdc
[0:2:3:0] disk INTEL RS2BL080 2.12 /dev/sdd
[0:2:4:0] disk INTEL RS2BL080 2.12 /dev/sde
[0:2:5:0] disk INTEL RS2BL080 2.12 /dev/sdf
[0:2:6:0] disk INTEL RS2BL080 2.12 /dev/sdg
Tuning SSD performance
Here are some optimizations for Linux I/O:
- Use the noop or deadline scheduler (the default is cfq) at /sys/block/sdX/queue/scheduler
- Set rotational=0
- Set read_ahead_kb=0 to disable read-ahead
First identify the physical name of each of the SSDs plugged into each server.
Node 1 Setup
[root@intel-eva1 ~]# vxdisk -e list
DEVICE   TYPE           DISK   GROUP   STATUS   OS_NATIVE_NAME   ATTR
disk_1   auto:cdsdisk   -      -       online   sdd              -
disk_2   auto:cdsdisk   -      -       online   sde              -
disk_3   auto:cdsdisk   -      -       online   sdb              -
disk_4   auto:cdsdisk   -      -       online   sdc              -
disk_5   auto:cdsdisk   -      -       online   sdf              -
disk_6   auto:cdsdisk   -      -       online   sdg              -
Verify what the default values are:
[root@intel-eva1 queue]# pwd
/sys/block/sdb/queue
[root@intel-eva1 queue]# cat rotational
1
[root@intel-eva1 queue]# cat read_ahead_kb
128
[root@intel-eva1 queue]# cat scheduler
noop anticipatory deadline [cfq]
[root@intel-eva1 queue]#
And now modify the values for the SSDs:
# echo deadline > /sys/block/sdd/queue/scheduler
# echo deadline > /sys/block/sde/queue/scheduler
# echo deadline > /sys/block/sdb/queue/scheduler
# echo deadline > /sys/block/sdc/queue/scheduler
# echo deadline > /sys/block/sdf/queue/scheduler
# echo deadline > /sys/block/sdg/queue/scheduler
# echo 0 > /sys/block/sdd/queue/rotational
# echo 0 > /sys/block/sde/queue/rotational
# echo 0 > /sys/block/sdb/queue/rotational
# echo 0 > /sys/block/sdc/queue/rotational
# echo 0 > /sys/block/sdf/queue/rotational
# echo 0 > /sys/block/sdg/queue/rotational
# echo 0 > /sys/block/sdd/queue/read_ahead_kb
# echo 0 > /sys/block/sde/queue/read_ahead_kb
# echo 0 > /sys/block/sdb/queue/read_ahead_kb
# echo 0 > /sys/block/sdc/queue/read_ahead_kb
# echo 0 > /sys/block/sdf/queue/read_ahead_kb
# echo 0 > /sys/block/sdg/queue/read_ahead_kb
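The same three settings can be applied with a short loop instead of line by line; note also that these sysfs values do not survive a reboot, so the loop may be appended to /etc/rc.local to reapply them at boot. A sketch, assuming the same device names:

# for d in sdb sdc sdd sde sdf sdg
> do
>     echo deadline > /sys/block/$d/queue/scheduler
>     echo 0 > /sys/block/$d/queue/rotational
>     echo 0 > /sys/block/$d/queue/read_ahead_kb
> done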
Node 2 Setup
[root@intel-eva2 init.d]# vxdisk -e list
DEVICE   TYPE           DISK   GROUP   STATUS   OS_NATIVE_NAME   ATTR
disk_1   auto:cdsdisk   -      -       online   sdb              -
disk_2   auto:cdsdisk   -      -       online   sdc              -
disk_3   auto:cdsdisk   -      -       online   sdd              -
disk_4   auto:cdsdisk   -      -       online   sde              -
disk_5   auto:cdsdisk   -      -       online   sdf              -
disk_6   auto:cdsdisk   -      -       online   sdg              -
Modify the values for the SSDs:
# echo deadline > /sys/block/sdd/queue/scheduler
# echo deadline > /sys/block/sde/queue/scheduler
# echo deadline > /sys/block/sdb/queue/scheduler
# echo deadline > /sys/block/sdc/queue/scheduler
# echo deadline > /sys/block/sdf/queue/scheduler
# echo deadline > /sys/block/sdg/queue/scheduler
# echo 0 > /sys/block/sdd/queue/rotational
# echo 0 > /sys/block/sde/queue/rotational
# echo 0 > /sys/block/sdb/queue/rotational
# echo 0 > /sys/block/sdc/queue/rotational
# echo 0 > /sys/block/sdf/queue/rotational
# echo 0 > /sys/block/sdg/queue/rotational
# echo 0 > /sys/block/sdd/queue/read_ahead_kb
# echo 0 > /sys/block/sde/queue/read_ahead_kb
# echo 0 > /sys/block/sdb/queue/read_ahead_kb
# echo 0 > /sys/block/sdc/queue/read_ahead_kb
# echo 0 > /sys/block/sdf/queue/read_ahead_kb
# echo 0 > /sys/block/sdg/queue/read_ahead_kb
Volumes and File Systems Configuration

Two file systems are going to be created: one for Oracle data under /tpccdata and the other for Oracle redo logs under /tpcclog.
The /tpccdata file system will have one stripe across the 3 x 745GB SSDs and it will be mirrored to the
other node. The /tpcclog file system will use one stripe across the 2 x 186GB SSDs and it will also be
mirrored for redundancy.
Given that the disks are named from disk_1 to disk_6, and that there is a mix of sizes, the first step will be to identify each of them. For clarity, we are going to rename them to make them easier to use.
Initialize and rename the internal SSD devices
Initialize the SSDs in each of the nodes. Note that in our configuration we have an extra 800GB SSD device, but it is not going to be used in our setup. Example for the first node:
[root@intel-eva1 ~]# vxdisk list
DEVICE TYPE DISK GROUP STATUS
disk_1 auto:none - - online invalid
disk_2 auto:none - - online invalid
disk_3 auto:none - - online invalid
disk_4 auto:none - - online invalid
disk_5 auto:none - - online invalid
disk_6 auto:none - - online invalid
Use the vxdisksetup command to write the label.
[root@intel-eva1 ~]# vxdisksetup -i disk_1
[root@intel-eva1 ~]# vxdisksetup -i disk_2
[root@intel-eva1 ~]# vxdisksetup -i disk_3
[root@intel-eva1 ~]# vxdisksetup -i disk_4
[root@intel-eva1 ~]# vxdisksetup -i disk_5
[root@intel-eva1 ~]# vxdisksetup -i disk_6
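The six initializations can equally be done in a loop; a sketch:

[root@intel-eva1 ~]# for i in 1 2 3 4 5 6; do vxdisksetup -i disk_$i; done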
Verify disks have been initialized.
[root@intel-eva1 ~]# vxdisk list
DEVICE TYPE DISK GROUP STATUS
disk_1 auto:cdsdisk - - online
disk_2 auto:cdsdisk - - online
disk_3 auto:cdsdisk - - online
disk_4 auto:cdsdisk - - online
disk_5 auto:cdsdisk - - online
disk_6 auto:cdsdisk - - online
Repeat the same steps in the second node.
Now we are going to identify the capacity for each device.
[root@intel-eva1 ~]# for i in 1 2 3 4 5 6; do echo "disk_$i"; vxdisk list disk_$i
| grep public ; done
disk_1
public: slice=3 offset=65792 len=1560343712 disk_offset=0
disk_2
public: slice=3 offset=65792 len=1560343712 disk_offset=0
disk_3
public: slice=3 offset=65792 len=1560430032 disk_offset=0
disk_4
public: slice=3 offset=65792 len=1560343712 disk_offset=0
disk_5
public: slice=3 offset=65792 len=388593808 disk_offset=0
disk_6
public: slice=3 offset=65792 len=388593808 disk_offset=0
[root@intel-eva1 ~]#
The len field is specified in 512-byte sectors, so 388593808 sectors ≈ 185GB and 1560343712 sectors ≈ 744GB.
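The conversion from sectors is simple arithmetic (one sector = 512 bytes); a sketch using shell integer arithmetic:

# echo $((388593808 * 512 / 1024 / 1024 / 1024))
185
# echo $((1560343712 * 512 / 1024 / 1024 / 1024))
744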
We are going to rename those disks according to the following table:

Original Name   Size    New Name
disk_1          744GB   Intel_SSD_744_1
disk_2          744GB   Intel_SSD_744_2
disk_3          744GB   Intel_SSD_744_3
disk_4          744GB   (not renamed, unused)
disk_5          185GB   Intel_SSD_185_1
disk_6          185GB   Intel_SSD_185_2
Use the vxdmpadm command to rename those devices:
[root@intel-eva1 ~]# vxdmpadm setattr dmpnode disk_1 name=Intel_SSD_744_1
[root@intel-eva1 ~]# vxdmpadm setattr dmpnode disk_2 name=Intel_SSD_744_2
[root@intel-eva1 ~]# vxdmpadm setattr dmpnode disk_3 name=Intel_SSD_744_3
[root@intel-eva1 ~]# vxdmpadm setattr dmpnode disk_5 name=Intel_SSD_185_1
[root@intel-eva1 ~]# vxdmpadm setattr dmpnode disk_6 name=Intel_SSD_185_2
And verify the changes:
[root@intel-eva1 ~]# vxdisk list
DEVICE TYPE DISK GROUP STATUS
Intel_SSD_185_1 auto:cdsdisk - - online
Intel_SSD_185_2 auto:cdsdisk - - online
Intel_SSD_744_1 auto:cdsdisk - - online
Intel_SSD_744_2 auto:cdsdisk - - online
Intel_SSD_744_3 auto:cdsdisk - - online
disk_4 auto:cdsdisk - - online
Run the same steps on the second node, taking note of the sizes and rename accordingly. These are the
specific steps for our configuration:
[root@intel-eva2]# for i in 1 2 3 4 5 6; do echo "disk_$i"; vxdisk list disk_$i |
grep public ; done
disk_1
public: slice=3 offset=65792 len=388593808 disk_offset=0
disk_2
public: slice=3 offset=65792 len=388593808 disk_offset=0
disk_3
public: slice=3 offset=65792 len=1560430032 disk_offset=0
disk_4
public: slice=3 offset=65792 len=1560430032 disk_offset=0
disk_5
public: slice=3 offset=65792 len=1560430032 disk_offset=0
disk_6
public: slice=3 offset=65792 len=1560430032 disk_offset=0
We are going to rename those disks according to the following table:

Original Name   Size    New Name
disk_1          185GB   Intel_SSD_185_1
disk_2          185GB   Intel_SSD_185_2
disk_3          744GB   Intel_SSD_744_1
disk_4          744GB   (not renamed, unused)
disk_5          744GB   Intel_SSD_744_2
disk_6          744GB   Intel_SSD_744_3
[root@intel-eva2]# vxdmpadm setattr dmpnode disk_1 name=Intel_SSD_185_1
[root@intel-eva2]# vxdmpadm setattr dmpnode disk_2 name=Intel_SSD_185_2
[root@intel-eva2]# vxdmpadm setattr dmpnode disk_3 name=Intel_SSD_744_1
[root@intel-eva2]# vxdmpadm setattr dmpnode disk_5 name=Intel_SSD_744_2
[root@intel-eva2]# vxdmpadm setattr dmpnode disk_6 name=Intel_SSD_744_3
And the final naming configuration:
[root@intel-eva2]# vxdisk list
DEVICE TYPE DISK GROUP STATUS
Intel_SSD_185_1 auto:cdsdisk - - online
Intel_SSD_185_2 auto:cdsdisk - - online
Intel_SSD_744_1 auto:cdsdisk - - online
Intel_SSD_744_2 auto:cdsdisk - - online
Intel_SSD_744_3 auto:cdsdisk - - online
disk_4 auto:cdsdisk - - online
Make internal SSD devices available to the cluster
Each of the nodes within our SFCFSHA setup has to export those five SSDs so they are visible to the other cluster node. Use the vxdisk command as follows. First on node 1:
[root@intel-eva1 ~]# vxdisk export Intel_SSD_185_1 Intel_SSD_185_2 Intel_SSD_744_1
Intel_SSD_744_2 Intel_SSD_744_3
And then export the SSDs from the second node.
[root@intel-eva2 ~]# vxdisk export Intel_SSD_185_1 Intel_SSD_185_2 Intel_SSD_744_1
Intel_SSD_744_2 Intel_SSD_744_3
Now run vxdisk list on each of the nodes and verify that the local SSDs appear as “online exported”, while the remote ones appear as “online remote”.
On the first node:
[root@intel-eva1 ~]# vxdisk list
DEVICE TYPE DISK GROUP STATUS
Intel_SSD_185_1 auto:cdsdisk - - online exported
Intel_SSD_185_2 auto:cdsdisk - - online exported
Intel_SSD_744_1 auto:cdsdisk - - online exported
Intel_SSD_744_2 auto:cdsdisk - - online exported
Intel_SSD_744_3 auto:cdsdisk - - online exported
disk_4 auto:cdsdisk - - online
intel-eva2_Intel_SSD_185_1 auto:cdsdisk - - online remote
intel-eva2_Intel_SSD_185_2 auto:cdsdisk - - online remote
intel-eva2_Intel_SSD_744_1 auto:cdsdisk - - online remote
intel-eva2_Intel_SSD_744_2 auto:cdsdisk - - online remote
intel-eva2_Intel_SSD_744_3 auto:cdsdisk - - online remote
sda auto:none - - online invalid
On the second node:
[root@intel-eva2 ~]# vxdisk list
DEVICE TYPE DISK GROUP STATUS
Intel_SSD_185_1 auto:cdsdisk - - online exported
Intel_SSD_185_2 auto:cdsdisk - - online exported
Intel_SSD_744_1 auto:cdsdisk - - online exported
Intel_SSD_744_2 auto:cdsdisk - - online exported
Intel_SSD_744_3 auto:cdsdisk - - online exported
disk_4 auto:cdsdisk - - online
intel-eva1_Intel_SSD_185_1 auto:cdsdisk - - online remote
intel-eva1_Intel_SSD_185_2 auto:cdsdisk - - online remote
intel-eva1_Intel_SSD_744_1 auto:cdsdisk - - online remote
intel-eva1_Intel_SSD_744_2 auto:cdsdisk - - online remote
intel-eva1_Intel_SSD_744_3 auto:cdsdisk - - online remote
sda auto:none - - online invalid
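On either node, a quick way to confirm that all ten SSDs are in the expected state is to filter the vxdisk output for the two statuses (an illustrative check):
[root@intel-eva1 ~]# vxdisk list | egrep 'exported|remote'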
Create a File System for redo logs
From one of the nodes, we are going to create an FSS (Flexible Shared Storage) disk group. Volumes in this disk group will have a plex on
each node, so the data is protected across the cluster nodes. Each write is performed in parallel on
the two nodes, while reads are served from the local node.
Create tpcclog01 diskgroup using the two internal plus two remote 185GB disks:
[root@intel-eva1 ~]# vxdg -o fss -s init tpcclog01 Intel_SSD_185_1 Intel_SSD_185_2
intel-eva2_Intel_SSD_185_1 intel-eva2_Intel_SSD_185_2
Create tpcclog volume:
[root@intel-eva1 ~]# vxassist -g tpcclog01 make tpcclog maxsize ncolumns=2
layout=mirror-stripe
Because the disk group has been created as FSS type, the vxassist command automatically creates a
plex on each node and includes a DCO volume for Fast Mirror Resync (FMR).
[root@intel-eva1 ~]# vxprint -g tpcclog01
TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
dg tpcclog01 tpcclog01 - - - - - -
dm Intel_SSD_185_1 Intel_SSD_185_1 - 388593808 - - - -
dm Intel_SSD_185_2 Intel_SSD_185_2 - 388593808 - - - -
dm intel-eva2_Intel_SSD_185_1 intel-eva2_Intel_SSD_185_1 - 388593808 - REMOTE - -
dm intel-eva2_Intel_SSD_185_2 intel-eva2_Intel_SSD_185_2 - 388593808 - REMOTE - -
v tpcclog fsgen ENABLED 776925184 - ACTIVE - -
pl tpcclog-01 tpcclog ENABLED 776925184 - ACTIVE - -
sd Intel_SSD_185_1-01 tpcclog-01 ENABLED 388462592 0 - - -
sd Intel_SSD_185_2-01 tpcclog-01 ENABLED 388462592 0 - - -
pl tpcclog-02 tpcclog ENABLED 776925184 - ACTIVE - -
sd intel-eva2_Intel_SSD_185_1-01 tpcclog-02 ENABLED 388462592 0 - - -
sd intel-eva2_Intel_SSD_185_2-01 tpcclog-02 ENABLED 388462592 0 - - -
dc tpcclog_dco tpcclog - - - - - -
v tpcclog_dcl gen ENABLED 94720 - ACTIVE - -
pl tpcclog_dcl-01 tpcclog_dcl ENABLED 94720 - ACTIVE - -
sd Intel_SSD_185_1-02 tpcclog_dcl-01 ENABLED 94720 0 - - -
pl tpcclog_dcl-02 tpcclog_dcl ENABLED 94720 - ACTIVE - -
sd intel-eva2_Intel_SSD_185_1-02 tpcclog_dcl-02 ENABLED 94720 0 - - -
It is important to note that a mirror-stripe layout supports FMR, while a stripe-mirror is a layered volume
layout that currently does not support FMR.
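To confirm that the DCO is attached to the volume, the DCO records can be listed directly (an illustrative check matching the tpcclog_dco and tpcclog_dcl records shown in the vxprint output above):
[root@intel-eva1 ~]# vxprint -g tpcclog01 | grep tpcclog_dc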
Create a file system with bsize=4096 for the redo logs:
[root@intel-eva1 ~]# mkfs -t vxfs -o bsize=4096 /dev/vx/rdsk/tpcclog01/tpcclog
version 10 layout
776925184 sectors, 97115648 blocks of size 4096, log size 16384 blocks
rcq size 2048 blocks
largefiles supported
maxlink supported
Add the mount point to the cluster configuration:
[root@intel-eva1 ~]# cfsmntadm add tpcclog01 tpcclog /tpcclog all=crw
Mount Point is being added...
/tpcclog added to the cluster-configuration
Mount it:
[root@intel-eva1 ~]# cfsmount /tpcclog
Mounting...
[/dev/vx/dsk/tpcclog01/tpcclog] mounted successfully at /tpcclog on intel-eva1
[/dev/vx/dsk/tpcclog01/tpcclog] mounted successfully at /tpcclog on intel-eva2
[root@intel-eva1 ~]#
And give the oracle user ownership of the mount point:
[root@intel-eva1 ]# chown oracle:oinstall /tpcclog
Create a File System for Data
Create the tpccdata01 diskgroup using the three internal plus the three remote 744GB SSDs:
[root@intel-eva1 ~]# vxdg -o fss -s init tpccdata01 Intel_SSD_744_1
Intel_SSD_744_2 Intel_SSD_744_3 intel-eva2_Intel_SSD_744_1 intel-
eva2_Intel_SSD_744_2 intel-eva2_Intel_SSD_744_3
Create a 2TB volume striped across the three local SSDs and mirrored across the remote SSDs located in
the other node.
[root@intel-eva1 ~]# vxassist -g tpccdata01 make tpccdata 2T ncolumns=3
layout=mirror-stripe
Verify the configuration.
[root@intel-eva1 ~]# vxprint -g tpccdata01
TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
dg tpccdata01 tpccdata01 - - - - - -
dm Intel_SSD_744_1 Intel_SSD_744_1 - 1560343712 - - - -
dm Intel_SSD_744_2 Intel_SSD_744_2 - 1560343712 - - - -
dm Intel_SSD_744_3 Intel_SSD_744_3 - 1560430032 - - - -
dm intel-eva2_Intel_SSD_744_1 intel-eva2_Intel_SSD_744_1 - 1560430032 - REMOTE - -
dm intel-eva2_Intel_SSD_744_2 intel-eva2_Intel_SSD_744_2 - 1560430032 - REMOTE - -
dm intel-eva2_Intel_SSD_744_3 intel-eva2_Intel_SSD_744_3 - 1560430032 - REMOTE - -
v tpccdata fsgen ENABLED 4294967296 - ACTIVE - -
pl tpccdata-01 tpccdata ENABLED 4294967424 - ACTIVE - -
sd Intel_SSD_744_1-01 tpccdata-01 ENABLED 1431655808 0 - - -
sd Intel_SSD_744_2-01 tpccdata-01 ENABLED 1431655808 0 - - -
sd Intel_SSD_744_3-01 tpccdata-01 ENABLED 1431655808 0 - - -
pl tpccdata-02 tpccdata ENABLED 4294967424 - ACTIVE - -
sd intel-eva2_Intel_SSD_744_1-01 tpccdata-02 ENABLED 1431655808 0 - - -
sd intel-eva2_Intel_SSD_744_2-01 tpccdata-02 ENABLED 1431655808 0 - - -
sd intel-eva2_Intel_SSD_744_3-01 tpccdata-02 ENABLED 1431655808 0 - - -
dc tpccdata_dco tpccdata - - - - - -
v tpccdata_dcl gen ENABLED 217216 - ACTIVE - -
pl tpccdata_dcl-01 tpccdata_dcl ENABLED 217216 - ACTIVE - -
sd Intel_SSD_744_1-02 tpccdata_dcl-01 ENABLED 217216 0 - - -
pl tpccdata_dcl-02 tpccdata_dcl ENABLED 217216 - ACTIVE - -
sd intel-eva2_Intel_SSD_744_1-02 tpccdata_dcl-02 ENABLED 217216 0 - - -
[root@intel-eva1 ~]#
And create a file system using bsize=8192.
[root@intel-eva1 ~]# mkfs -t vxfs -o bsize=8192 /dev/vx/rdsk/tpccdata01/tpccdata
version 10 layout
4294967296 sectors, 268435456 blocks of size 8192, log size 32768 blocks
rcq size 8192 blocks
largefiles supported
maxlink supported
Add the mount point to the cluster:
[root@intel-eva1 ~]# cfsmntadm add tpccdata01 tpccdata all=crw
Mount Point is being added...
/tpccdata added to the cluster-configuration
[root@intel-eva1 ~]#
And mount it:
[root@intel-eva1 ~]# cfsmount /tpccdata
Mounting...
[/dev/vx/dsk/tpccdata01/tpccdata] mounted successfully at /tpccdata on intel-
eva1
[/dev/vx/dsk/tpccdata01/tpccdata] mounted successfully at /tpccdata on intel-
eva2
Finally, give the oracle user ownership of the folder:
[root@intel-eva1 ]# chown oracle:oinstall /tpccdata
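Before moving on to the Oracle configuration, it is worth confirming that both file systems are mounted on the two nodes (an illustrative check):
[root@intel-eva1 ~]# df -h /tpcclog /tpccdata
[root@intel-eva2 ~]# df -h /tpcclog /tpccdata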
Oracle Configuration and Tuning
This section highlights some key points that need to be followed during Oracle configuration. It is
not a complete guide to installing and configuring Oracle; rather, it focuses on the tuning needed to get the best
performance from the Intel SSD devices.
Installing Oracle binaries on each node
Oracle Single Instance will be used in this configuration, which makes the deployment simpler and cheaper.
With the Fast Failover capabilities of VCS and the utilization of Intel SSDs as the storage, we will
be creating a highly resilient configuration with very fast recovery capabilities.
If none of the Enterprise Edition features are going to be used, and given the number of sockets in this
setup (4), the Standard Edition can be used:
Oracle binaries will be installed under the /oracle folder on each node of the cluster.
Perform the same steps on the second node of the cluster to install the binaries locally.
Instance configuration
Oracle's Database Configuration Assistant (dbca) will be used to configure the first instance. The steps
needed to create the instance on the file systems created earlier are described here.
The first step is to configure a listener using the netca tool. LISTENER will be created:
[oracle@intel-eva1]$ netca
Oracle Net Services Configuration:
Configuring Listener:LISTENER
Listener configuration complete.
Oracle Net Listener Startup:
Running Listener Control:
/oracle/product/11.2.0/dbhome_1/bin/lsnrctl start LISTENER
Listener Control complete.
Listener started successfully.
Oracle Net Services configuration successful. The exit code is 0
The database instance will be identified as tpcc:
At step 6 of the installer, make sure to select File System as the storage type and enter /tpccdata
as the folder to store the database files:
At step 10, click on each of the Redo Log Groups and change the directory to /tpcclog for groups 1, 2
and 3:
Finally, the folder $ORACLE_BASE/admin/SID needs to be copied to node 2.
Oracle Disk Manager configuration
Oracle Disk Manager (ODM) provides support for Oracle’s file management and I/O calls for database
storage on VxFS file systems. ODM provides true kernel asynchronous I/O for files, reduces system call
overhead, improves file system layout by preallocating contiguous files on a VxFS file system, and
delivers file system performance equivalent to raw devices.
ODM is transparent to users once it is enabled.
First, make sure the Oracle database is down.
Then we need to create a link from the Oracle library directory to the VRTSodm one. Make sure this file exists:
$ ls -l /opt/VRTSodm/lib64/libodm.so
-rwxr-xr-x. 1 bin bin 71358 Sep 30 03:42 libodm.so
Then go to the $ORACLE_HOME/lib folder (/oracle/product/11.2.0/dbhome_1/lib in our case) and
check the libodm library:
[oracle@intel-eva1]$ cd $ORACLE_HOME/lib
[oracle@intel-eva1]$ ls -l libodm*
-rw-r--r--. 1 oracle oinstall 7442 Aug 14 2009 libodm11.a
lrwxrwxrwx. 1 oracle oinstall 12 Oct 24 22:38 libodm11.so -> libodmd11.so
-rw-r--r--. 1 oracle oinstall 12331 Aug 14 2009 libodmd11.so
Move the current link aside:
[oracle@intel-eva1]$ mv libodm11.so libodm11.so.preVxFS
And create a new link to the VRTSodm library:
[oracle@intel-eva1]$ ln -s /opt/VRTSodm/lib64/libodm.so libodm11.so
Check that the link is correct:
[oracle@intel-eva1 ]$ ls -l libodm*
-rw-r--r--. 1 oracle oinstall 7442 Aug 14 2009 libodm11.a
lrwxrwxrwx. 1 oracle oinstall 28 Oct 29 00:49 libodm11.so ->
/opt/VRTSodm/lib64/libodm.so
lrwxrwxrwx. 1 oracle oinstall 12 Oct 24 22:38 libodm11.so.preVxFS ->
libodmd11.so
-rw-r--r--. 1 oracle oinstall 12331 Aug 14 2009 libodmd11.so
Repeat the same steps on the other node.
Start Oracle again.
Looking at the log file /oracle/diag/rdbms/tpcc/tpcc/trace/alert_tpcc.log, we can verify
that Oracle is using the VRTSodm library:
db_name = "tpcc"
open_cursors = 300
diagnostic_dest = "/oracle"
Oracle instance running with ODM: Veritas 6.1.0.000 ODM Library, Version 2.0
Tue Oct 29 00:55:07 2013
PMON started with pid=2, OS id=30440
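Besides the alert log, a quick way to confirm that ODM is active is to check for the ODM pseudo file system, which VRTSodm mounts at /dev/odm (an illustrative check, assuming the default mount point):
[root@intel-eva1 ~]# mount | grep odm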
Oracle redo log configuration
To obtain better performance from the redo logs, it is advisable to change the default 512-byte
block size to 4096 bytes. In this cluster, five 40GB redo log files were added with the following
procedure:
SQL> ALTER DATABASE ADD LOGFILE '/tpcclog/log_4k_1' size 41000M blocksize 4096;
Repeat the same statement for log_4k_2 through log_4k_5.
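Spelled out, the remaining statements follow the same pattern:
SQL> ALTER DATABASE ADD LOGFILE '/tpcclog/log_4k_2' size 41000M blocksize 4096;
SQL> ALTER DATABASE ADD LOGFILE '/tpcclog/log_4k_3' size 41000M blocksize 4096;
SQL> ALTER DATABASE ADD LOGFILE '/tpcclog/log_4k_4' size 41000M blocksize 4096;
SQL> ALTER DATABASE ADD LOGFILE '/tpcclog/log_4k_5' size 41000M blocksize 4096;
Then switch the log file so Oracle starts writing to one of the new logs: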
SQL> alter system switch logfile;
Review alert_tpcc.log until you see that one of the new logs is being used, then force a checkpoint:
SQL> alter system checkpoint local;
Remove the old redo log files. First, identify the group numbers by running:
SQL> select * from v$logfile;
And remove them, repeating for each of the old groups:
SQL> alter database drop logfile group 1;
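Once the old groups have been dropped, the remaining groups and their block size can be verified from v$log (the BLOCKSIZE column is available in Oracle 11gR2; an illustrative query):
SQL> select group#, blocksize, bytes/1024/1024 as size_mb, status from v$log;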
Oracle huge pages configuration
The following tuning has been performed in order to enable huge pages for the SGA.
Modify the file /etc/sysctl.conf on both servers with:
vm.nr_hugepages=200000
Increase the memlock soft/hard limits for the oracle user. On both servers, edit the file
/etc/security/limits.conf with:
oracle soft memlock 410000000
oracle hard memlock 410000000
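With the default 2MB huge page size on Linux, vm.nr_hugepages=200000 reserves roughly 390GB for the SGA, and the 410000000KB memlock limit (roughly 391GB) keeps the amount of lockable memory just above that reservation. After applying the settings (or rebooting), the reservation can be verified with standard Linux tools (an illustrative check):
[root@intel-eva1 ~]# sysctl -p
[root@intel-eva1 ~]# grep Huge /proc/meminfo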
Oracle HA and Fast Failover Configuration
Oracle agent configuration in VCS
In order to configure the Oracle agent for HA, its type definition file needs to be imported. This can be done
manually, using the Cluster Manager console, or using Veritas Operations Manager (VOM). Here we
present the example using VOM, as it provides a global view of storage and HA and will simplify
solution maintenance.
On the Availability tab, right click on the cluster name and select “Import Type Definition”
In the Import Type Definition window type /etc/VRTSagents/ha/conf/Oracle/OracleTypes.cf:
One Oracle service group will be created to hold all the resources needed to run the tpcc instance:
IP address, NIC, Oracle database and Oracle listener. Another service group will be created to hold
the volumes and mount points that will be online on the two nodes at the same time. Finally,
dependencies will be established among the service groups in the cluster. These steps are
explained below.
Remove previous HA configuration for the mount points
In a previous step, CFS mount points for /tpcclog and /tpccdata were created. This was done using
the CLI in order to have the storage available to create the first instance. Those mount points are now
going to be moved into the new service group, so we first remove them from the current
configuration.
[root@intel-eva1 ~]# cfsmntadm delete /tpcclog
[root@intel-eva1 ~]# cfsmntadm delete /tpccdata
Please note that these steps only remove the mount points from the HA configuration; the mount
points and file systems are still available on the system. We are simply going to reconfigure them
under the new service group together with the other resources.
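A quick way to confirm that the file systems are still mounted after removing them from the HA configuration is to list the VxFS mounts on each node; /tpcclog and /tpccdata should still appear (an illustrative check):
[root@intel-eva1 ~]# mount -t vxfs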
Service Groups and Resources for Oracle HA
The next step is to create the two service groups where the resources needed for Oracle will be added.
The first one will be a parallel service group containing the mount points that were previously
removed. The second will hold the Oracle resources.
Service Group tpcc_data
Right click on Service Group and select Create Service Group. This one will be parallel and will be named
tpcc_data:
Add the two available systems to the service group and click Finish. This is a parallel service group, as the
mount points will be available on both systems at the same time. They form the base framework for a
Fast Failover configuration, making the data continuously available to the two nodes of the cluster.
Now both /tpccdata and /tpcclog will be added into the tpcc_data service group. Right click on the
tpcc_data service group and select Add/Modify Resources.
In the following window, add these four resources to the tpcc_data service group:
Then click on the dotted icon and add the properties for each of the resources.
For the clustered volume resources:
Name          Type      CVMDiskGroup  CVMVolume  CVMActivation
tpccdata_vol  CVMVolDg  tpccdata01    tpccdata   sw
tpcclog_vol   CVMVolDg  tpcclog01     tpcclog    sw
And for the CFS mount points:
Name          Type      Block Device                     Mount Point  MountOpt
tpccdata_mnt  CFSMount  /dev/vx/dsk/tpccdata01/tpccdata  /tpccdata    crw
tpcclog_mnt   CFSMount  /dev/vx/dsk/tpcclog01/tpcclog    /tpcclog     crw
Then click on Enabled and clear the Critical check box for now, to avoid any errors until you have verified that the
entire configuration is correct. Then click on Finish:
There is a dependency between these resources: the mount points are the parents of the volumes. Click
Next and add these dependencies:
This will be the final dependency view:
To make sure that the service group is automatically brought online upon system restart, right click on
the service group and select Properties. Click on the Attributes tab and then right click on the
AutoStartList to edit it.
Add the two nodes to the list:
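The AutoStartList attribute can also be set from the command line with the VCS ha commands, which can be quicker when scripting the setup (an illustrative alternative to the VOM steps, using the group and node names from this document):
[root@intel-eva1 ~]# haconf -makerw
[root@intel-eva1 ~]# hagrp -modify tpcc_data AutoStartList intel-eva1 intel-eva2
[root@intel-eva1 ~]# haconf -dump -makero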
Service Group tpcc_instance
The next step is to configure a specific service group for the tpcc Oracle instance. Follow the same
instructions used to create the tpcc_data service group, but this time select a failover service group, given
that the tpcc instance will be running on only one node at a time:
Create the following resources for this service group:
And set the properties for each of them:
Listener:
- Home: /oracle/product/11.2.0/dbhome_1
- Owner: oracle
NIC:
- Device: eth0
Oracle:
- Home: /oracle/product/11.2.0/dbhome_1
- Owner: oracle
- Sid: tpcc
IP:
- Address: 10.182.100.138
- Device: eth0
- Netmask: 255.255.248.0
And create the following dependencies between those resources:
As in the previous service group, modify the AutoStartList to include both nodes of the cluster.
This will be the final hierarchy:
So far, two service groups have been created: a parallel one for the data, which takes care of the volumes
and the mount points and is active on the two nodes at the same time, and another one that holds the
resources needed to bring the database up and monitor it. The second service group needs to have the
first one available, so there is a service group dependency between the two. To create this
dependency, right click on the tpcc_instance service group and select Edit and Link:
Select tpcc_instance as the parent group, tpcc_data as the child, and Online Local Firm:
The tpcc_data service group also depends on the cvm service group, so create that dependency as well:
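Both service group dependencies can also be created from the command line with hagrp -link, whose arguments are the parent group, the child group, and the dependency category, location and type (an illustrative alternative to the VOM steps; the cvm link uses the same online local firm type typically applied to CFS groups):
[root@intel-eva1 ~]# haconf -makerw
[root@intel-eva1 ~]# hagrp -link tpcc_instance tpcc_data online local firm
[root@intel-eva1 ~]# hagrp -link tpcc_data cvm online local firm
[root@intel-eva1 ~]# haconf -dump -makero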
This will be the final service group dependency tree for the cluster:
Once the configuration has been tested and it has been verified that the Oracle instance can fail over
between nodes, set the resources as Critical so that the cluster will take the proper actions in case of failure.
Fast Failover Setting
SFCFSHA is able to provide a Fast Failover framework for Oracle based on two technologies. First, the
use of Cluster Volume Manager and Cluster File System provides a single namespace that is concurrently
accessible from all the nodes of the cluster. This means there is no delay in making the data
accessible to a node after a failure, as the file systems are already mounted. This is the goal of the tpcc_data
service group.
Second is the use of the Asynchronous Monitoring Framework (AMF) for the VCS agents. This framework
allows detection of any application failure in real time. Previous releases needed manual configuration,
but that is no longer the case in SFCFSHA 6.1. For reference, we include below where in the
configuration this property is specified.
The file /etc/sysconfig/amf determines whether AMF will start and stop:
cat /etc/sysconfig/amf
AMF_START=1
AMF_STOP=1
The following agents used in this configuration already support IMF:
- CVMCluster
- CVMVxconfigd
- CVMVolDg
- CFSMount
- CFSfsckd
- Coordination Point Agent
- Oracle
- Netlsnr
This command can be used to check that the AMF module is loaded:
[root@intel-eva1 ~]# /etc/init.d/amf status
AMF: Module loaded and configured
To verify the IMF properties, right click on each agent type and select Properties, then look for the IMF attribute. This
is the example for Oracle:
Mode 3 means IMF is enabled for both Online and Offline operations. In addition to asynchronous
monitoring, a normal monitor cycle is triggered every 5 minutes (MonitorFreq). The Netlsnr agent has the
same properties by default:
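The same IMF settings can also be inspected from the command line with the hatype command (an illustrative check; IMF is a key-value attribute whose keys include Mode and MonitorFreq):
[root@intel-eva1 ~]# hatype -display Oracle -attribute IMF
[root@intel-eva1 ~]# hatype -display Netlsnr -attribute IMF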