RedHat 4 working cluster (on VMware) config

I have been struggling with RH Cluster 4 using a VMware fencing device. It was also a good chance to gain experience with qdiskd, the quorum disk daemon, and how it is used. I came away with several conclusions. First, the configuration, as is:

<?xml version="1.0"?>
<cluster alias="alpha_cluster" config_version="17" name="alpha_cluster">
<quorumd interval="1" label="Qdisk1" min_score="3" tko="10" votes="3">
<heuristic interval="2" program="ping vm-server -c1 -t1" score="10"/>
<fence_daemon post_fail_delay="0" post_join_delay="3"/>
<clusternode name="clusnode1" nodeid="1" votes="1">
<multicast addr="" interface="eth0"/>
<method name="1">
<device name="vmware"
<clusternode name="clusnode2" nodeid="2" votes="1">
<multicast addr="" interface="eth0"/>
<method name="1">
<device name="vmware"
<multicast addr=""/>
<fencedevice agent="fence_vmware" ipaddr="vm-server" login="cluster"
name="vmware" passwd="clusterpwd"/>
<failoverdomain name="cluster_domain" ordered="1" restricted="1">
<failoverdomainnode name="clusnode1" priority="1"/>
<failoverdomainnode name="clusnode2" priority="1"/>
<fs device="/dev/sdb2" force_fsck="1" force_unmount="1" fsid="62307"
fstype="ext3" mountpoint="/mnt/sdb1" name="data"
options="" self_fence="1"/>
<ip address="" monitor_link="1"/>
<script file="/usr/local/" name="My_Script"/>
<service autostart="1" domain="cluster_domain" name="Test_srv">
<fs ref="data">
<ip ref="">
<script ref="My_Script"/>

Several notes:

  1. You should run mkqdisk -c /dev/sdb1 -l Qdisk1 (or whatever device is for your quorum disk)
  2. qdiskd should be added to the chkconfig db (chkconfig --add qdiskd)
  3. The qdiskd start order should be changed from 22 to 20, so that it starts before cman
  4. fence_vmware was changed according to the earlier directives, including Yoni’s comment for RH4
  5. The structure changed. Instead of using two fence devices, I use a single fence device with different “ports”. fenced translates a port to “-n” for fence_vmware, just as it does for fence_brocade
  6. lock_gulmd should be turned off using chkconfig
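
Notes 1 to 3 and 6 could be sketched as the shell snippet below. Treat it as a sketch under assumptions rather than the exact procedure I followed: it presumes the qdiskd init script lives at /etc/init.d/qdiskd and carries the usual "# chkconfig: <runlevels> <start> <stop>" header with a start priority of 22, and the helper name set_start_priority is made up for illustration. The cluster commands themselves are left as comments so that nothing runs by accident.

```shell
#!/bin/sh
# Sketch only: the helper name and the init-script path are assumptions.

# Rewrite the start priority in an init script's "# chkconfig:" header.
# Header format: "# chkconfig: <runlevels> <start> <stop>"
# Usage: set_start_priority <init-script> <old-prio> <new-prio>
set_start_priority() {
    script=$1 old=$2 new=$3
    # Requires GNU sed for in-place editing (-i).
    sed -i "s/^\(# chkconfig:[[:space:]]*[^[:space:]]*[[:space:]]*\)$old\([[:space:]]\)/\1$new\2/" "$script"
}

# On a cluster node, as root, the notes would translate to roughly:
#   mkqdisk -c /dev/sdb1 -l Qdisk1                 # note 1: label the quorum disk
#   set_start_priority /etc/init.d/qdiskd 22 20    # note 3: start before cman
#   chkconfig --del qdiskd 2>/dev/null             # drop links with the old order
#   chkconfig --add qdiskd                         # note 2: re-register with the new order
#   chkconfig lock_gulmd off                       # note 6
```

Removing and re-adding the service is there so that chkconfig re-reads the edited header; if the links were never created in the first place, the --del line is harmless.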

A little about changing the config version from the command line:

When you update the cluster.conf file, it is not enough to push it to ccsd using “ccs_tool update /etc/cluster/cluster.conf”; cman is still running with the older version. Using “cman_tool version -r <new version>”, you tell cman about the new config version, so that other nodes using the latest config can join after a reboot. If you skip this step, other nodes might be rejected.
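
The update sequence could be scripted roughly as follows. This is a minimal sketch: the helper name config_version is hypothetical, and it assumes the config_version attribute sits on the <cluster> line with plain ASCII quotes, as it does in a real cluster.conf. The two cluster commands are shown as comments.

```shell
#!/bin/sh
# Sketch only: the helper name is hypothetical.

# Print the config_version attribute of the <cluster> element in a cluster.conf file.
# Usage: config_version <path-to-cluster.conf>
config_version() {
    sed -n 's/.*<cluster[[:space:]][^>]*config_version="\([0-9][0-9]*\)".*/\1/p' "$1" | head -n 1
}

# After editing cluster.conf and raising config_version by hand:
#   ver=$(config_version /etc/cluster/cluster.conf)
#   ccs_tool update /etc/cluster/cluster.conf   # hand the new file to ccsd
#   cman_tool version -r "$ver"                 # make cman accept the new version
```

Reading the version back from the file, instead of typing it twice, keeps the number given to cman_tool in step with what ccsd was handed.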

I will add additional information as I move along.
