Creating a Red Hat Cluster
Creating a Red Hat Cluster: Part 4
In this article we will create a GFS filesystem that will allow us to share data between nodes. In the next and last article we'll finalise the cluster by completing our ftp and web services so they really work. We will also show you how to manually move a service from one server to another. We still have some work to do, so let's start right away.
Adding a SAN disk to our servers
The Linux operating system is installed on the internal disks of each of our servers. We will now add a SAN disk that will be visible to each of them. I assume here that your SAN and your Brocade switch are configured accordingly; explaining how to set them up is not in the scope of this article. But you get the idea: the new disk must be visible to every node in our cluster. In the example below we already have a SAN disk (sda) with one partition (sda1) on it. Adding a disk to the servers can be done live, without any interruption of service, if you follow the steps below. I would suggest you practice on a test server to become familiar with the procedure.
Before we add a disk, let's see which disks are visible on the system by looking at the /proc/partitions file. We can see that we already have a disk (sda) with one partition on it, so the new disk we are going to add should be seen as "sdb".
root@gollum~# grep sd /proc/partitions
8     0  104857920 sda
8     1  104856223 sda1
Let's rescan the SCSI bus by typing the command below. This command must be run on each server within the cluster. Here, we have only one HBA (Host Bus Adapter) card connected to the SAN on each server. If you have a second HBA, you need to run the same command for it, replacing "host0" with "host1".
root@gollum~# echo "- - -" > /sys/class/scsi_host/host0/scan
root@gandalf~# echo "- - -" > /sys/class/scsi_host/host0/scan
root@bilbo~# echo "- - -" > /sys/class/scsi_host/host0/scan
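If you do have a second HBA, the extra rescan simply targets the second host adapter (shown here for gollum only, as an example):

root@gollum~# echo "- - -" > /sys/class/scsi_host/host1/scan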
Let's see if any new disk (sdb) was detected (check each server).
root@gollum~# grep sd /proc/partitions
8     0  104857920 sda
8     1  104856223 sda1
8    16   15728640 sdb
Let's create an LVM partition on our new disk (sdb) by running the "fdisk" command.
# fdisk /dev/sdb
Command (m for help): p
Disk /dev/sdb: 107.3 GB, 107374510080 bytes
255 heads, 63 sectors/track, 13054 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot      Start         End      Blocks   Id  System
Command (m for help): n   (n=new partition)
Command action
   e   extended
   p   primary partition (1-4)
p   (p=primary partition)
Partition number (1-4): 1   (first partition = 1)
First cylinder (1-13054, default 1): 1   (start at the beginning of the disk)
Last cylinder or +size or +sizeM or +sizeK (1-13054, default 13054): 13054   (end of the disk)
Command (m for help): p   (p=print partition information)
Disk /dev/sdb: 107.3 GB, 107374510080 bytes
255 heads, 63 sectors/track, 13054 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            1       13054   104856223+  83  Linux
Command (m for help): t   (t=type of partition)
Selected partition 1   (change the type of partition 1)
Hex code (type L to list codes): 8e   (8e=LVM partition - type L to list partition codes)
Changed system type of partition 1 to 8e (Linux LVM)
Command (m for help): p   (p=print partition information)
Disk /dev/sdb: 107.3 GB, 107374510080 bytes
255 heads, 63 sectors/track, 13054 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            1       13054   104856223+  8e  Linux LVM
Command (m for help): w (w=write partition to disk)
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
If we look again at /proc/partitions, we should see that our new disk and partition are visible on this server.
root@gollum~# grep sd /proc/partitions
8     0  104857920 sda
8     1  104856223 sda1
8    16   15728640 sdb
8    17   15727658 sdb1
Now we need to make sure that the new disk and partition are seen by every server within the cluster. Go to every server in the cluster and run the "partprobe" (partition probe) command. After running it, check, as we did on "gollum", that all the disks and partitions are seen by each server.
root@gollum~# partprobe
root@gandalf~# partprobe
root@bilbo~# partprobe
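For example, on "gandalf" the same check we ran on "gollum" should now show the new disk and partition (the output below assumes it matches what we saw on gollum):

root@gandalf~# grep sd /proc/partitions
8     0  104857920 sda
8     1  104856223 sda1
8    16   15728640 sdb
8    17   15727658 sdb1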
Creating our physical volume
Now that we know the disk is seen by every node, let's create the physical volume on one of the servers, then check that it is seen on all the other servers in the cluster. The command to create a physical volume is "pvcreate"; what we are really doing here is creating a physical volume on the partition (/dev/sdb1) we created earlier.
# pvcreate /dev/sdb1
Physical volume "/dev/sdb1" successfully created
Let’s run a pvscan on every node, to validate that every node can actually see the new disk.
# pvscan
PV /dev/sda1   VG datavg   lvm2 [100.00 GB / 22.02 GB free]
PV /dev/sdb1               lvm2 [100.00 GB]
Create our clustered volume group
We will now create a new volume group named "sharevg" and assign the physical volume "/dev/sdb1" to that group. If we ever ran out of disk space within "sharevg", we could add another physical volume to the volume group and continue to work without any service disruption. This is a real advantage when working in a production environment.
# vgcreate sharevg /dev/sdb1
Clustered volume group "sharevg" successfully created
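As mentioned above, growing "sharevg" later is just a matter of adding another physical volume. A minimal sketch, assuming a new SAN disk has already been presented to all nodes and partitioned as a hypothetical /dev/sdc1:

# pvcreate /dev/sdc1
# vgextend sharevg /dev/sdc1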
Display the “sharevg” volume group properties
We can display the volume group properties by issuing the "vgdisplay" command. We can see that our volume group is "Clustered", so it is cluster aware. This will later allow us to create an LV (logical volume/filesystem) on one server and have the cluster software automatically advise the cluster members that a new logical volume (filesystem) is available.
root@gollum~# vgdisplay sharevg
--- Volume group ---
VG Name sharevg
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 25
VG Access read/write
VG Status resizable
Clustered yes
Shared no
MAX LV 0
Cur LV 0
Open LV 0
Max PV 0
Cur PV 1
Act PV 1
VG Size 100.00 GB
PE Size 4.00 MB
Total PE 24999
Alloc PE / Size 1 / 2.10 MB
Free PE / Size 24980 / 98.00 GB
VG UUID V8Ag76-vdW2-NAk4-JjOo-or3l-GuPz-x5LEKP
Create a logical volume of 1024 MB named "cadminlv" in the "sharevg" volume group
We will create a logical volume named "cadminlv" (Cluster Admin) in our sharevg volume group. The command below asks to create a logical volume of 1024 MB, named "cadminlv", in the volume group "sharevg". This command can be run on one server and the logical volume will be seen by every member of the cluster.
# /usr/sbin/lvcreate -L1024M -n cadminlv sharevg
Logical volume "cadminlv" created
The "lvs" command allows you to display a list of all your logical volumes. Since this is currently the only one in the volume group "sharevg", we filter the list (with the grep command) to display only the logical volumes of the "sharevg" volume group. Let's check that it is seen by all nodes.
root@gandalf~# lvs | grep sharevg
  cadminlv sharevg -wi-a- 1024.00M
root@gollum~# lvs | grep sharevg
  cadminlv sharevg -wi-a- 1024.00M
root@bilbo~# lvs | grep sharevg
  cadminlv sharevg -wi-a- 1024.00M
Creating the /cadmin GFS Filesystem
Finally, we are going to create a GFS filesystem within the logical volume "cadminlv" we have just created. But first we need to create our GFS filesystem mount point on every node. We need to do that because we want this filesystem to be mounted on every node and be available to all 3 nodes.
root@gandalf~# mkdir /cadmin
root@bilbo~# mkdir /cadmin
root@gollum~# mkdir /cadmin
We have chosen to have the GFS filesystem "/cadmin" mounted on all servers at all times. We could have included it as part of our service, so it would be mounted only when a service is started, but we found that unmounting and mounting a couple of GFS filesystems takes time, and this time adds up to the time it takes to move a service from one server to another. In production we have had 5 servers in a cluster for more than 2 years now, with around 30 GFS filesystems mounted at all times on all five servers, and we have had very few problems. The only thing you have to be careful about is the number of journals that you assign to each GFS filesystem. One journal is needed for each concurrent mount in the cluster; in our case we have at least 5 journals for each of our GFS filesystems (more on that below).

Now let's create the GFS filesystem on the logical volume "cadminlv" created previously. This needs to be done on only one node; the creation is done once and all the nodes are made aware of the new GFS filesystem by the cluster daemon.
The command we use to create a GFS filesystem is “gfs_mkfs”. We need to use a couple of options and I will explain them all.
First, the "-O" option prevents "gfs_mkfs" from asking for confirmation before creating the filesystem.
The "-p lock_dlm" option indicates the name of the locking protocol to use. The locking protocol should be "lock_dlm" for a clustered file system.
The "-t our_cluster:cadminlv" option is the cluster name, followed by ":" and the filesystem name (here we use the same name as the logical volume). The cluster name must match the one you have defined in your cluster configuration file (in our case "our_cluster"); only members of this cluster are permitted to use this file system. The filesystem name (cadminlv) is a unique name used to distinguish this GFS file system from others created (1 to 16 characters).
The "-j 4" option is the number of journals for gfs_mkfs to create. You need at least one journal per machine that will mount the filesystem. This number could have been 3, but I always add one more in case I add a member to the cluster. This number is important: if I had put 3, then added a node to the cluster and wanted the 4 nodes to mount this filesystem simultaneously, I would have to make sure the filesystem has 4 journals, because otherwise the GFS filesystem would not mount. You can always add a journal to an existing GFS filesystem with the "gfs_jadd" command (a short sketch of this appears after we mount the filesystem below). Each journal reserves 128 MB in the filesystem, so you need to take that into consideration. Let's look at our example: we want all our nodes to mount the "/cadmin" GFS filesystem; we created a logical volume of 1024 MB, created a GFS filesystem on it and reserved 4 journals (4*128=512 MB), so we will have only around 500 MB available for data out of the 1024 MB we allocated to our logical volume.
The last parameter, "/dev/sharevg/cadminlv", is the name of the logical volume we created previously.
# /sbin/gfs_mkfs -O -p lock_dlm -t our_cluster:cadminlv -j 4 /dev/sharevg/cadminlv
Device:             /dev/sharevg/cadminlv
Blocksize:          4096
Filesystem Size:    98260
Journals:           4
Resource Groups:    8
Locking Protocol:   lock_dlm
Lock Table:         our_cluster:cadminlv
Syncing...
All Done
We are now able to mount our GFS filesystem on all the servers by using the command below on each of them.
# mount -t gfs /dev/sharevg/cadminlv /cadmin
We want this filesystem to be mounted every time a server boots, so don't forget to add it to /etc/fstab so it will mount after the next reboot, and don't forget to change the owner and permissions of the filesystem.
# echo "/dev/sharevg/cadminlv /cadmin gfs defaults 0 0" >> /etc/fstab
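For example, the ownership and permissions could be adjusted like this on one node (the values shown here are only an assumption; use whatever your services require):

# chown root:root /cadmin
# chmod 755 /cadmin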
The filesystem should be available on all our nodes.
# df -h /cadmin
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/sharevg-cadminlv  510M  1.5M  508M   1% /cadmin
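As mentioned earlier, if a node is later added to the cluster you may need an extra journal. A minimal sketch, run on one node while the filesystem is mounted; since gfs_mkfs used the whole logical volume, we assume the volume is grown first to make room for the additional 128 MB journal:

# lvextend -L +128M /dev/sharevg/cadminlv
# gfs_jadd -j 1 /cadmin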
So we’ve just created our first GFS filesystem and it is mounted on all our nodes in the cluster.
In our last article we will finalise our cluster by creating the needed scripts for our ftp/web services to start and to move from server to server. We will add these scripts to our cluster configuration and we will show you how to move a service from one server to another using the command line and the GUI. So stay tuned for this last article on how to build a Red Hat cluster.
Creating a Red Hat Cluster: Part 5
Welcome back to LINternUX for our last article of this series on how to build a working Red Hat cluster. So far we have a working cluster, but it only moves the IP from server to server. In this article, we will put in place everything needed so that we have an FTP and a web service that are fully redundant within our cluster. In our previous article, we created a GFS filesystem under the mount point "/cadmin"; this is where we will put the scripts, configuration files and logs used for our cluster. The content of the "/cadmin" filesystem can be downloaded here; it includes all the directory structure and scripts used in our cluster articles. After this article, you will have a fully configured cluster, running an ftp and a web service. We have a lot to do, so let's begin.
FTP prerequisite
We need to make sure that the ftp server "vsftpd" is installed on every server in our cluster. You can check if it is installed by typing the following command;
root@gandalf:~# rpm -q vsftpd
vsftpd-2.0.5-16.el5_5.1
root@gandalf:~#
If it is not installed, run the following command on the servers where it is missing;
root@bilbo:~# yum install vsftpd
We must make sure that vsftpd is not started and doesn't start upon reboot. To do so, use the following commands on all servers;
root@bilbo:~# service vsftpd stop
Shutting down vsftpd:                 [FAILED]
root@bilbo:~# chkconfig vsftpd off
Script to stop/start/status our FTP service
Now we need to create a script for each of our services (ftp and web) that the cluster software will use to stop and start the appropriate service, and add it to our cluster configuration. We'll put these scripts in our /cadmin GFS filesystem so they are accessible from our 3 servers. We will start by creating the script for the ftp service. A script used by the Red Hat Cluster Suite receives one parameter when called by the cluster software; this parameter can be "stop", "start" or "status". You can download a copy of the script and the vsftpd configuration file if you want, but remember that if you want to use them as is, you must put them in the /cadmin filesystem. The "srv_ftp.sh" script will go in a subdirectory named "/cadmin/srv" and the configuration file "srv_ftp.conf" must go in the "/cadmin/cfg" directory. But nothing beats an example; let's build the one for our FTP service.
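Before looking at the script itself, note that since /cadmin is a shared GFS filesystem, the directory layout only needs to be created once, on any node. A minimal sketch of the structure used by the scripts and configuration files in this article (the www/html directory will be used later for the web service):

root@gollum:~# mkdir -p /cadmin/srv /cadmin/cfg /cadmin/log /cadmin/www/html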
#! /bin/bash
# ---------------------------------------------------------------------------------
# Script to stop/start and give a status of ftp service in the cluster.
# This script is built to receive one of 3 parameters:
# - start : Executed by cluster to start the application(s) or service(s)
# - stop : Executed by cluster to stop the application(s) or service(s)
# - status: Executed by cluster every 30 seconds to check service status.
# -------------------------------------------------------------------------
# Author : Jacques Duplessis - April 2011
# -------------------------------------------------------------------------
#set -x
CDIR="/cadmin" ; export CDIR # Root directory for Services
CSVC="$CDIR/srv" ; export CSVC # Service Scripts Directory
CCFG="$CDIR/cfg" ; export CCFG # Service Config. Directory
INST="srv_ftp" ; export INST # Service Instance Name
LOG="$CDIR/log/${INST}.log" ; export LOG # Service Log file name
HOSTNAME=`hostname -a` ; export HOSTNAME # HostName
VSFTPD="/usr/sbin/vsftpd" ; export VSFTPD # Service Program name
FCFG="${CCFG}/${INST}.conf" ; export FCFG # Service Config. file name
RC=0 ; export RC # Service Return Code
DASH="---------------------"; export DASH # Dash Line
# Where the Action Start
# -------------------------------------------------------------------------
case "$1" in
start) echo -e "\n${DASH}" >> $LOG 2>&1
echo -e "Starting service $INST on $HOSTNAME at `date`" >> $LOG 2>&1
echo -e "${VSFTPD} ${FCFG}" >> $LOG 2>&1
${VSFTPD} ${FCFG} >> $LOG 2>&1
RC=$?
FPID=`ps -ef |grep -v grep |grep ${FCFG} |awk '{ print $2 }'|head -1`
echo "Service $INST started on $HOSTNAME - PID=${FPID} RC=$RC">> $LOG
echo "${DASH}" >> $LOG 2>&1
;;
stop ) echo -e "\n${DASH}" >> $LOG 2>&1
echo -e "Stopping Service $INST on $HOSTNAME at `date` " >> $LOG
ps -ef | grep ${FCFG}| grep -v grep >> $LOG 2>&1
FPID=`ps -ef |grep -v grep |grep ${FCFG} |awk '{ print $2 }'|head -1`
echo -e "Killing PID ${FPID}" >> $LOG 2>&1
kill $FPID >> $LOG 2>&1
echo -e "Service $INST is stopped ..." >> $LOG 2>&1
RC=0
echo "${DASH}" >> $LOG 2>&1
;;
status) COUNT=`ps -ef | grep ${FCFG}| grep -v grep | wc -l`
FPID=`ps -ef |grep -v grep |grep ${FCFG} |awk '{ print $2 }'|head -1`
echo -n "`date` Service $INST ($COUNT) on $HOSTNAME">> $LOG 2>&1
if [ $COUNT -gt 0 ]
then echo " - PID=${FPID} - OK" >> $LOG 2>&1
RC=0
else echo " - NOT RUNNING" >> $LOG 2>&1
ps -ef | grep -i ${FCFG} | grep -v grep >> $LOG 2>&1
RC=1
fi
;;
esac
exit $RC
This script is named "srv_ftp.sh" and is placed in the directory "/cadmin/srv".
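The matching vsftpd configuration file, "/cadmin/cfg/srv_ftp.conf", is included in the downloadable archive. A minimal sketch of what such a file might contain is shown below; apart from the virtual IP, which is the ftp.maison.ca address (192.168.1.204) used in this series, every value here is an assumption. Note that "background=YES" matters, because the script above expects vsftpd to return control after it is launched:

# /cadmin/cfg/srv_ftp.conf - minimal sketch (assumed values)
listen=YES
listen_address=192.168.1.204
background=YES
anonymous_enable=YES
local_enable=YES
write_enable=NO
xferlog_enable=YES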
Add ftp script to our ftp cluster service
Now, let's add this script to our ftp service. Run the "system-config-cluster" command to start the cluster configuration GUI.
root@gandalf:~# system-config-cluster &
Click on "Resources" on the left side and then on the "Create a Resource" button at the bottom right of the screen. This will allow us to insert our ftp service script into the cluster configuration.
Select "Script" from the "Resource Type" list, enter the name of our resource, "srv_ftp", and then specify the name of the script the service will use, with its full path. As mentioned, we decided to place it in our "/cadmin" GFS filesystem so it is seen by every node in the cluster.
Now we need to edit our “srv_ftp” service to add the resource we just created.
Select the “srv_ftp” service at the bottom left of the screen and then press the “Edit Service Properties” button.
Click on the "Add a Shared Resource to this service" button and select the "srv_ftp" script that we want to add to our service.
After adding our script to the service, press the "Close" button.
We are now ready to push our new configuration to the members of our cluster; press the "Send to Cluster" button to do so.
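After the configuration has been sent, the resource manager section of /etc/cluster/cluster.conf should contain something along these lines. This is only a hand-written sketch for orientation, not the exact file produced by the GUI; attributes such as autostart and recovery, and the omission of the failover domain, are assumptions based on typical defaults:

<rm>
    <resources>
        <script file="/cadmin/srv/srv_ftp.sh" name="srv_ftp"/>
    </resources>
    <service autostart="1" name="srv_ftp" recovery="relocate">
        <ip address="192.168.1.204" monitor_link="1"/>
        <script ref="srv_ftp"/>
    </service>
</rm>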
Web site prerequisite
Make sure that the "httpd" and "php" packages are installed on every server in our cluster. You can check if they are installed by typing the following command;
root@gandalf # rpm -q httpd php
httpd-2.2.3-45.el5
php-5.1.6-27.el5_5.3
root@gandalf #
If they are not installed, run the following command to install them on the servers where they are missing;
root@bilbo:~# yum install httpd php
We must make sure that "httpd" is not started and doesn't start upon reboot. To do so, use the following commands on all servers;
root@bilbo:~# service httpd stop
Shutting down httpd:                  [FAILED]
root@bilbo:~# chkconfig httpd off
Script to stop/start/status our Web service
We have simplified the configuration of our web site to the minimum. This was done intentionally; we wanted to demonstrate the cluster functionality, not the "httpd" possibilities. But our web site will be functional and redundant. Our web server script is very similar to the ftp script. You can download this script and the httpd configuration file if you want, but remember that if you want to use them as is, you must put them in the /cadmin filesystem. The "srv_www.sh" script will go in a subdirectory named "/cadmin/srv" and the configuration file "srv_www.conf" must go in the "/cadmin/cfg" directory.
#! /bin/bash
# -------------------------------------------------------------------------
# Script to stop/start and give a status of our web service in the cluster.
# This script is built to receive one of 3 parameters:
# - start : Executed by cluster to start the application(s) or service(s)
# - stop : Executed by cluster to stop the application(s) or service(s)
# - status: Executed by cluster every 30 seconds to check service status.
# ------------------------------------------------------------------------
# Author : Jacques Duplessis - April 2011
# -------------------------------------------------------------------------
#set -x
CDIR="/cadmin" ; export CDIR # Root directory for Services
CSVC="$CDIR/srv" ; export CSVC # Service Scripts Directory
CCFG="$CDIR/cfg" ; export CCFG # Service Config. Directory
INST="srv_www" ; export INST # Service Instance Name
LOG="$CDIR/log/${INST}.log" ; export LOG # Service Log file name
HOSTNAME=`hostname -a` ; export HOSTNAME # HostName
HTTPD="/usr/sbin/httpd" ; export HTTPD # Service Program name
HCFG="${CCFG}/${INST}.conf" ; export HCFG # Service Config. file name
RC=0 ; export RC # Service Return Code
DASH="---------------------"; export DASH # Dash Line
# Where the Action Start
# -------------------------------------------------------------------------
case "$1" in
start) echo -e "\n${DASH}" >> $LOG 2>&1
echo -e "Starting service $INST on $HOSTNAME at `date`" >> $LOG 2>&1
echo -e "${HTTPD} -f ${HCFG}" >> $LOG 2>&1
${HTTPD} -f ${HCFG} >> $LOG 2>&1
RC=$?
HPID=`cat ${CCFG}/${INST}.pid`
echo "Service $INST started on $HOSTNAME - PID=${HPID} RC=$RC">> $LOG
echo "${DASH}" >> $LOG 2>&1
;;
stop ) echo -e "\n${DASH}" >> $LOG 2>&1
echo -e "Stopping Service $INST on $HOSTNAME at `date` " >> $LOG
HPID=`cat ${CCFG}/${INST}.pid`
echo -e "Killing PID ${HPID}" >> $LOG 2>&1
kill $HPID > /dev/null 2>&1
echo -e "Service $INST is stopped ..." >> $LOG 2>&1
RC=0
echo "${DASH}" >> $LOG 2>&1
;;
status) COUNT=`ps -ef | grep ${HCFG}| grep -v grep | wc -l`
HPID=`cat ${CCFG}/${INST}.pid`
echo -n "`date` Service $INST ($COUNT) on $HOSTNAME">> $LOG 2>&1
if [ $COUNT -gt 0 ]
then echo " - PID=${HPID} - OK" >> $LOG 2>&1
RC=0
else echo " - NOT RUNNING" >> $LOG 2>&1
ps -ef | grep -i ${HCFG} | grep -v grep >> $LOG 2>&1
RC=1
fi
;;
esac
exit $RC
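The script above starts httpd with "-f ${HCFG}" and reads its PID from "/cadmin/cfg/srv_www.pid", so the "srv_www.conf" it points to must at least set PidFile accordingly. The real file is part of the downloadable cadmin.tar; the fragment below is only a sketch of the directives relevant to this setup (a working httpd 2.2 configuration also needs the usual ServerRoot, LoadModule and MIME sections, and the log path shown is an assumption):

# /cadmin/cfg/srv_www.conf - partial sketch only
PidFile /cadmin/cfg/srv_www.pid
Listen 80
DocumentRoot "/cadmin/www/html"
ErrorLog /cadmin/log/srv_www_error.log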
Updating our cluster Configuration
To add our web service, follow the same sequence as we did when we inserted our ftp service into the cluster configuration. You only need to replace "srv_ftp.sh" with "srv_www.sh"; the script path stays the same, since we have decided to place our scripts in the directory "/cadmin/srv". Once we have pushed the new configuration to all servers in the cluster, we should have a working cluster. The web site defined in the configuration has its "Root Directory" set to "/cadmin/www/html"; it contains only one file, which displays the name of the server it is running on. This will help us test our cluster configuration.
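The actual page is included in cadmin.tar; a minimal sketch of such a page (the file name and exact content here are assumptions) could be created once on the shared filesystem like this:

root@gollum:~# cat > /cadmin/www/html/index.php <<'EOF'
<?php
// Print the name of the node currently serving the page
echo "This page is served by " . php_uname('n');
?>
EOF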
If you wish to use the cluster configuration, scripts and configuration files we have used in this series of articles, I would encourage you to download the "cadmin.tar" file. The file is the actual content of the "/cadmin" directory used throughout this article. To use it, download the "cadmin.tar" file, copy it to your "/cadmin" directory and enter the command "tar -xvf ./cadmin.tar". This will extract the tar file and you will then have the working environment I used in this article.
Testing our ftp service
So here we are (finally, you would say... hum, me too); we now have a fully working cluster. If we issue the "clustat" command, this is what we should see.
root@gollum:/# clustat
Cluster Status for our_cluster @ Sat Apr 16 11:37:25 2011
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
hbbilbo.maison.ca 1 Online, rgmanager
hbgandalf.maison.ca 2 Online, rgmanager
hbgollum.maison.ca 3 Online, Local, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:srv_ftp hbgollum.maison.ca started
service:srv_www hbbilbo.maison.ca started
root@gollum:/#
From the information above, we can see that all our cluster members are online and that the resource manager is running on all of them. The resource manager is important; it is responsible for moving services around when needed. Our service "srv_ftp" is started (running) on the "hbgollum" server and "srv_www" is running on "hbbilbo", like we decided at the beginning (remember?).
root@gollum:/# ip addr show | grep 192
inet 192.168.1.104/24 brd 192.168.1.255 scope global eth0
inet 192.168.1.204/24 scope global secondary eth0
root@gollum:/#
root@gollum:/# ps -ef | grep vsftpd | grep -v grep
root 7858 1 0 10:05 ? 00:00:00 /usr/sbin/vsftpd /cadmin/cfg/srv_ftp.conf
root@gollum:/#
The command "ip addr show | grep 192" confirms that the virtual IP is defined on the "hbgollum" server, and if we check whether the ftp process is also running, we can see that it is. So let's try an FTP to our virtual IP, which we have named "ftp.maison.ca" (192.168.1.204). We will try it from the "gandalf" server, and we see that it is working.
root@gandalf:/# ftp ftp.maison.ca
Connected to ftp.maison.ca.
220 ftp.maison.ca
530 Please login with USER and PASS.
530 Please login with USER and PASS.
KERBEROS_V4 rejected as an authentication type
Name (ftp.maison.ca:root):
Now let's move the ftp service from "hbgollum" to "hbbilbo" to see if the ftp service continues to work. To move the service we use the "clusvcadm" command; we need to specify the service name we want to relocate (-r) and the member (-m) we wish to move it to. You can issue the "clusvcadm" command on any of the servers within our cluster. So enter the following command to move our service to "hbbilbo";
root@gandalf:/# clusvcadm -r srv_ftp -m hbbilbo
'hbbilbo' not in membership list
Closest match: 'hbbilbo.maison.ca'
Trying to relocate service:srv_ftp to hbbilbo.maison.ca...Success
service:srv_ftp is now running on hbbilbo.maison.ca
root@gandalf:/#
Notice that just after pressing [Enter], we got "'hbbilbo' not in membership list"; this is because we did not mention the domain name "maison.ca", but it managed to assume that we were referring to "hbbilbo.maison.ca". Our command succeeded, so let's see if everything went like it should have.
First, let's execute the clustat command to see if the "srv_ftp" service is now running on "hbbilbo".
root@gandalf:/# clustat
Cluster Status for our_cluster @ Sat Apr 16 12:10:01 2011
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
hbbilbo.maison.ca 1 Online, rgmanager
hbgandalf.maison.ca 2 Online, Local, rgmanager
hbgollum.maison.ca 3 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:srv_ftp hbbilbo.maison.ca started
service:srv_www hbbilbo.maison.ca started
root@gandalf:/#
We can see that our ftp service is now running on "hbbilbo"; let's see if it really is. If we check whether 192.168.1.204 (ftp.maison.ca) is now defined on "hbbilbo", we can see that it is. The FTP process is also now running on that server.
root@bilbo:~# ip addr show | grep 192
inet 192.168.1.111/24 brd 192.168.1.255 scope global eth0
inet 192.168.1.211/24 scope global secondary eth0
inet 192.168.1.204/24 scope global secondary eth0
root@bilbo:~# ps -ef | grep vsftpd | grep -v grep
root 8616 1 0 12:05 ? 00:00:00 /usr/sbin/vsftpd /cadmin/cfg/srv_ftp.conf
root@bilbo:~#
But what happened on "hbgollum"? The IP 192.168.1.204 should not be there anymore and the FTP process should no longer be running. And that is indeed what happened: the IP is gone and the ftp process is no longer running. So far so good, the move to the "hbbilbo" server has worked.
root@gollum:/etc/profile.d# ip addr show | grep 192
inet 192.168.1.104/24 brd 192.168.1.255 scope global eth0
root@gollum:/etc/profile.d# ps -ef | grep vsftpd | grep -v grep
root@gollum:/etc/profile.d#
The last test is to try an ftp to ftp.maison.ca and see if it responds.
root@gandalf:/# ftp ftp.maison.ca
Connected to ftp.maison.ca.
220 ftp.maison.ca
530 Please login with USER and PASS.
530 Please login with USER and PASS.
KERBEROS_V4 rejected as an authentication type
Name (ftp.maison.ca:root):
Great, everything worked! Let's move the ftp process back to "hbgollum" before testing our web site. Open another terminal window, enter the "clustat -i 2" command and watch the status change from 'started/stopping/starting/started' while the move is going on. Check your /var/log/messages and familiarize yourself with the lines recorded when the move happens.
root@gandalf:/# clusvcadm -r srv_ftp -m hbgollum
'hbgollum' not in membership list
Closest match: 'hbgollum.maison.ca'
Trying to relocate service:srv_ftp to hbgollum.maison.ca...Success
service:srv_ftp is now running on hbgollum.maison.ca
root@gandalf:/#
One of the tests we should make is to unplug the network cable or power off "hbgollum" and see if the service moves to the next server in the failover domain (it will). So we have now completed and tested our ftp service. It has been a long road, but it was worth it, no?
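One simple way to follow such a failover test from a surviving node, for example, is to keep an eye on the cluster status and on the system log while the node goes down (the grep pattern here is just a suggestion):

root@gandalf:/# clustat -i 2
root@gandalf:/# tail -f /var/log/messages | grep -i rgmanager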
Testing our web service
You now know how to move a service from one server to another; let's do the same test with our web service. The web site is actually just one simple page; it displays the name of the server it is running on, which simplifies our testing.
If we issue the “clustat” command we should have the following picture;
# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:15:37 2011
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
hbbilbo.maison.ca 1 Online, rgmanager
hbgandalf.maison.ca 2 Online, Local, rgmanager
hbgollum.maison.ca 3 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:srv_ftp hbgollum.maison.ca started
service:srv_www hbbilbo.maison.ca started
Let's see if it is working: open your browser and type the URL "http://www.maison.ca"; you should see a page showing the name of the server the web service is currently running on.
Now, let's move the web site to the "gandalf" server by typing the following command;
root@gollum:/cadmin/cfg# clusvcadm -r srv_www -m hbgandalf
'hbgandalf' not in membership list
Closest match: 'hbgandalf.maison.ca'
Trying to relocate service:srv_www to hbgandalf.maison.ca...Success
service:srv_www is now running on hbgandalf.maison.ca
root@gollum:/cadmin/cfg#
root@gollum:/cadmin/cfg# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:27:14 2011
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
hbbilbo.maison.ca 1 Online, Local, rgmanager
hbgandalf.maison.ca 2 Online, rgmanager
hbgollum.maison.ca 3 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:srv_ftp hbgollum.maison.ca started
service:srv_www hbgandalf.maison.ca started
We can see that the web site is now running on “gandalf” server.
Disabling and Enabling Services
There may come a time when you need to stop a service completely. We will demonstrate how to achieve that; first, let's display the status of our cluster.
root@bilbo:~# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:39:48 2011
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
hbbilbo.maison.ca 1 Online, Local, rgmanager
hbgandalf.maison.ca 2 Online, rgmanager
hbgollum.maison.ca 3 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:srv_ftp hbgollum.maison.ca started
service:srv_www hbgandalf.maison.ca started
We are going to disable the “srv_www” service, to do so enter the following command;
root@bilbo:~# clusvcadm -d srv_www
Local machine disabling service:srv_www...Success
The clustat command shows us that the service is now disabled.
root@bilbo:~# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:40:04 2011
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
hbbilbo.maison.ca 1 Online, Local, rgmanager
hbgandalf.maison.ca 2 Online, rgmanager
hbgollum.maison.ca 3 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:srv_ftp hbgollum.maison.ca started
service:srv_www (hbgandalf.maison.ca) disabled
We will now enable the service, but this time we will enable it on a server other than "hbgandalf". This command enables the "srv_www" service on the server "hbbilbo".
root@bilbo:~# clusvcadm -e srv_www -m hbbilbo
'hbbilbo' not in membership list
Closest match: 'hbbilbo.maison.ca'
Member hbbilbo.maison.ca trying to enable service:srv_www...Success
service:srv_www is now running on hbbilbo.maison.ca
root@bilbo:~#
We can see that it is now running on "hbbilbo".
root@bilbo:~# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:47:36 2011
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
hbbilbo.maison.ca 1 Online, Local, rgmanager
hbgandalf.maison.ca 2 Online, rgmanager
hbgollum.maison.ca 3 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:srv_ftp hbgollum.maison.ca started
service:srv_www hbbilbo.maison.ca started
root@bilbo:~#
This concludes our implementation of a small cluster. It was intended simply to show everyone how the Red Hat Cluster Suite actually works and to give a brief overview of what it can do. We will now move on to other interesting topics. I don't know yet what the next one will be, but I can assure you it should fit into one article. I hope you appreciated this series and hope to see you soon.