Single-Node Linux Heartbeat Cluster with DRBD on CentOS

By etzion 23/10/2006

The trick is simple, and many of those who deal with HA clusters end up building such a setup at least once: an HA cluster without the HA.

Yep. Single node, just to make sure you know how to get this system to play.

I have just completed such a setup with Linux Heartbeat, and wish to share an example of a single-node cluster with DRBD.

First – get the packages.

It took me some time, but following the download link suggested by Linux-HA (funny enough, it was the last place I searched) gave me exactly what I needed. I downloaded the following RPMs:

heartbeat-2.0.7-1.c4.i386.rpm

heartbeat-ldirectord-2.0.7-1.c4.i386.rpm

heartbeat-pils-2.0.7-1.c4.i386.rpm

heartbeat-stonith-2.0.7-1.c4.i386.rpm

perl-Mail-POP3Client-2.17-1.c4.noarch.rpm

perl-MailTools-1.74-1.c4.noarch.rpm

perl-Net-IMAP-Simple-1.16-1.c4.noarch.rpm

perl-Net-IMAP-Simple-SSL-1.3-1.c4.noarch.rpm

I also had to add the following RPMs:

perl-IO-Socket-SSL-1.01-1.c4.noarch.rpm

perl-Net-SSLeay-1.25-3.rf.i386.rpm

perl-TimeDate-1.16-1.c4.noarch.rpm

I then added the DRBD RPMs, obtained through YUM:

drbd-0.7.21-1.c4.i386.rpm

kernel-module-drbd-2.6.9-42.EL-0.7.21-1.c4.i686.rpm (Note: Make sure the module version fits your kernel!)
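The note about matching the module to the kernel can be checked mechanically. The following is a minimal sketch, assuming the package name embeds the kernel release right after the `kernel-module-drbd-` prefix, as it does in the file name above. The helper `module_matches_kernel` is hypothetical and is not part of DRBD or rpm.

```shell
#!/bin/sh
# Sketch: check that a DRBD kernel-module RPM name matches a kernel release.
# module_matches_kernel is a hypothetical helper, not part of DRBD or rpm.
# ASSUMPTION: the package is named kernel-module-drbd-<kernel release>-<rest>.

module_matches_kernel() {
    rpm_name="$1"   # e.g. kernel-module-drbd-2.6.9-42.EL-0.7.21-1.c4.i686.rpm
    kernel="$2"     # e.g. the output of `uname -r`
    case "$rpm_name" in
        kernel-module-drbd-"$kernel"-*) return 0 ;;   # quoted part is matched literally
        *) return 1 ;;
    esac
}

rpm_name="kernel-module-drbd-2.6.9-42.EL-0.7.21-1.c4.i686.rpm"
if module_matches_kernel "$rpm_name" "$(uname -r)"; then
    echo "module package matches running kernel $(uname -r)"
else
    echo "module package does NOT match running kernel $(uname -r)"
fi
```

Because the release is followed by a literal dash in the pattern, a package built for 2.6.9-42.EL is not mistaken for one built for 2.6.9-42.ELsmp.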

As soon as I finished searching for dependent RPMS, I was able to install them all in one go, and so I did.

Configuring DRBD:

DRBD was a tricky setup. It would not accept a missing destination node, and required me to actually lie. My /etc/drbd.conf looks as follows (thanks to the great assistance of linux-ha.org):

resource web {
  protocol C;
  incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f"; #Replace later with halt -f
  startup { wfc-timeout 0; degr-wfc-timeout 120; }
  disk { on-io-error detach; } # or panic, ...
  syncer {
    group 0;
    rate 80M; #1Gb/s network!
  }
  on p800old {
    device /dev/drbd0;
    disk /dev/VolGroup00/drbd-src;
    address 1.2.3.4:7788; #eth0 network address!
    meta-disk /dev/VolGroup00/drbd-meta[0];
  }
  on node2 {
    device /dev/drbd0;
    disk /dev/sda1;
    address 192.168.99.2:7788; #eth0 network address!
    meta-disk /dev/sdb1[0];
  }
}

I have had two major problems with this setup:

1. I had no second node, so I left the default "on node2" section in place as the second node. I never expected to use it.

2. I had no free (unpartitioned) space on my disk. Luckily, I tend to install CentOS/RH using the installation defaults unless some special need arises, so I could rely on LVM: I disabled swap (swapoff -a), shrank the swap volume (lvresize -L -500M /dev/VolGroup00/LogVol01), created two logical volumes for the DRBD metadata and source (lvcreate -n drbd-meta -L +128M VolGroup00 && lvcreate -n drbd-src -L +300M VolGroup00), re-created the swap area (mkswap /dev/VolGroup00/LogVol01), re-enabled swap (swapon -a) and formatted /dev/VolGroup00/drbd-src (mke2fs -j /dev/VolGroup00/drbd-src). I now have the two additional volumes this setup requires as a minimum, and can operate it.
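The LVM steps above are order-sensitive, so it can help to see them as one sequence. This is a dry-run sketch only: the commands and sizes are the ones from the text, while the `run` wrapper, the `carve_drbd_volumes` function and the `DRY_RUN` variable are hypothetical names introduced for illustration. By default nothing is executed; each command is just printed.

```shell
#!/bin/sh
# Dry-run sketch of the LVM steps described above.
# Nothing is executed unless DRY_RUN=0 is set; by default each command is
# only printed. Volume names and sizes are the ones from the text.
# run, carve_drbd_volumes and DRY_RUN are hypothetical names.

DRY_RUN="${DRY_RUN:-1}"
VG="VolGroup00"
SWAP_LV="/dev/$VG/LogVol01"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"          # show what would be run
    else
        "$@"                 # really run it
    fi
}

carve_drbd_volumes() {
    run swapoff -a                              # swap must be off before shrinking it
    run lvresize -L -500M "$SWAP_LV"            # free 500M inside the volume group
    run lvcreate -n drbd-meta -L +128M "$VG"    # DRBD metadata volume
    run lvcreate -n drbd-src -L +300M "$VG"     # DRBD backing volume
    run mkswap "$SWAP_LV"                       # the shrunk volume needs a new swap area
    run swapon -a
    run mke2fs -j "/dev/$VG/drbd-src"           # ext3 on the backing volume
}

carve_drbd_volumes
```

Reviewing the printed list before setting DRY_RUN=0 is the point of the wrapper, since shrinking the wrong logical volume destroys data.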

With the space issue solved, I had to start DRBD for the first time. Per the Linux-HA DRBD manual, this is done by running the following commands:

modprobe drbd

drbdadm up all

drbdadm -- --do-what-I-say primary all

This has brought the DRBD up for the first time. Now I had to turn it off, and concentrate on Heartbeat:

drbdadm secondary all
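To confirm that the role changes took effect, the state can be read from /proc/drbd. The sketch below is an assumption-laden illustration: it assumes the DRBD 0.7 status line has the form ` 0: cs:WFConnection st:Secondary/Unknown ld:Consistent` (with no peer, the connection state and the peer role are expected to stay at something like WFConnection and Unknown). The helper `drbd_role` is hypothetical, not a DRBD tool.

```shell
#!/bin/sh
# Sketch: extract the local/peer role of a DRBD device from /proc/drbd text.
# ASSUMPTION: the DRBD 0.7 status line looks like
#   " 0: cs:WFConnection st:Secondary/Unknown ld:Consistent"
# drbd_role is a hypothetical helper, not part of DRBD.

drbd_role() {
    minor="$1"
    # Read status text on stdin; print the st: field of the requested minor.
    sed -n "s|^ *${minor}: .* st:\([A-Za-z]*/[A-Za-z]*\).*|\1|p"
}

if [ -r /proc/drbd ]; then
    echo "drbd0 role: $(drbd_role 0 < /proc/drbd)"
else
    echo "/proc/drbd not readable; is the drbd module loaded?"
fi
```

The first word of the printed pair is the local role, so after "drbdadm secondary all" it should read Secondary.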

Heartbeat settings were as follows:

/etc/ha.d/ha.cf:

use_logd on #?Or should it be used?
udpport 694
keepalive 1 # 1 second
deadtime 10
initdead 120
bcast eth0
node p800old #`uname -n` name
crm yes
auto_failback off #?Or no
compression bz2
compression_threshold 2
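As the comment on the node line says, the name must be the one reported by `uname -n`. A minimal sketch of a check for that follows, assuming the file lives at /etc/ha.d/ha.cf and that anything after a `#` is a comment. The helper `ha_cf_nodes` is hypothetical.

```shell
#!/bin/sh
# Sketch: list the "node" entries of an ha.cf file and compare them with
# the local host name. ha_cf_nodes is a hypothetical helper.
# ASSUMPTION: text after "#" on a line is a comment.

ha_cf_nodes() {
    file="$1"
    # Strip comments, then print every name that follows a "node" directive.
    sed 's/#.*//' "$file" | awk '$1 == "node" { for (i = 2; i <= NF; i++) print $i }'
}

conf="${1:-/etc/ha.d/ha.cf}"
if [ -r "$conf" ]; then
    if ha_cf_nodes "$conf" | grep -qxF "$(uname -n)"; then
        echo "ok: $(uname -n) is listed as a node in $conf"
    else
        echo "warning: $(uname -n) is not listed as a node in $conf"
    fi
fi
```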

I also created a matching /etc/ha.d/haresources, although it is never read directly (this file has no importance when ha.cf contains "crm yes"). I did, however, use it as the input for /usr/lib/heartbeat/haresources2cib.py:

p800old IPaddr::1.2.3.10/8/1.255.255.255 drbddisk::web Filesystem::/dev/drbd0::/mnt::ext3 httpd

The virtual IP is 1.2.3.10 in my class A network, and DRBD has to come up before the storage is mounted. After all this, the application kicks in and brings up my web page. The application, Apache, was modified beforehand to listen on 1.2.3.10:80 and to use /mnt as its DocumentRoot.
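The haresources line packs the node name, the resources and their arguments into a single row. Resources are started left to right and stopped in the reverse order. The sketch below only unpacks that line for reading; `describe_haresources` is a hypothetical helper and not part of Heartbeat.

```shell
#!/bin/sh
# Sketch: split a haresources line into its node and ordered resource list.
# describe_haresources is a hypothetical helper for illustration only.

describe_haresources() {
    set -- $1                 # split the line on whitespace
    echo "node: $1"
    shift
    n=0
    for res in "$@"; do
        n=$((n + 1))
        name="${res%%::*}"    # resource script name, before the first "::"
        args=""
        if [ "$res" != "$name" ]; then
            # the remaining "::"-separated fields are the script's arguments
            args="$(printf '%s' "${res#*::}" | sed 's/::/ /g')"
        fi
        echo "$n. $name${args:+ args: $args}"
    done
}

describe_haresources "p800old IPaddr::1.2.3.10/8/1.255.255.255 drbddisk::web Filesystem::/dev/drbd0::/mnt::ext3 httpd"
```

Read top to bottom, the numbered output is the start order: the address, then the DRBD resource, then the filesystem on /dev/drbd0, and Apache last.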

I ran /usr/lib/heartbeat/haresources2cib.py on the file (no need to redirect the output, as it is already written to /var/lib/heartbeat/crm/cib.xml), and I was ready to go.

I ran /etc/init.d/heartbeat start (with another terminal open on tail -f /var/log/messages), and Heartbeat was up. It took a few minutes to bring the resources up, but I was more than happy to see it all work. Cool.

The logic is quite simple, the idea is very basic, and as long as the system is managed correctly, there is no reason for it to get into a dangerous state. Moreover, since DRBD keeps a separate copy of the data on each node, a split brain cannot corrupt a single shared disk (the two copies may diverge and need to be resynchronized, but each stays intact), so we get compensated for the price we might pay, performance-wise, in a real two-node HA environment following these same guidelines.

I cannot express enough gratitude to http://www.linux-ha.org, which is the source of all this (combined with some common sense). Their documents are more than enough to set up a fully working HA environment.


3 Comments

darkfader says:

23/08/2011 at 12:49 pm

Kudos for that writeup, you turned up #1 for drbd single node and I think I found anything I needed.

And one more thing – single node clusters don’t just rock for testing, they have other big advantages:

– Run all your applications under control of a resource manager & monitor, so you get automatic restart and a standardized notification framework
– If you run all your systems as clusters (even 1-node clusters) then you’ll have no operational differences, which saves a lot on admin errors
and, the best thing:
– If you run your app in a single node cluster you can easily turn it into a HA cluster once needed, without any downtime.
So i.e. once your user decides “oh gosh, this is a critical app” then you’ll just define a second node and add it in.
Tada, immediate failover support.

They won’t pay for 24×7 HA any longer? well, just remove the 2nd node again.

Have fun with your clusters – and go check out Veritas Cluster suite (there’s a 30-day demo) if you wanna see some real cluster power!
(I used heartbeat and VCS – heartbeat is nice as long as things go well, but not quite… telco grade if you get me 🙂
VCS on the other hand is completely cost-prohibitive, but it doesn’t hurt to know both options…

ez-aton says:
  
  23/08/2011 at 10:12 pm
  
  Thanks. And now for some comments:
  1. You are correct. There are other uses for a single-node cluster. I specified only a single consideration for setting up a single-node cluster, and thanks for bringing up other issues.
  2. I have had fun with many more clusters. VCS, HACMP, SunCluster, RedHat Cluster, Linux HA, and even MS Cluster (2000&2003). So I could say I am familiar with other clusters. Linux HA would not be my recommendation nowadays. I recommend RedHat Cluster, and I implement both it and VCS. When there are things beyond VCS ability to perform, there is hardly any way to make it do so (and I have seen some very twisted logic going in some of its implementations to override its limitations). It’s easier, on many levels, with RedHat Cluster.
  I will return your advice:
  Have fun with your clusters – go and check out RedHat Cluster (you can get it from your local RHEL installation media) if you wanna see some real cluster power!
  And RHCS is very much telco grade. You are most invited to check it, and pop a question if you want any clarification about it.
  
  Thanks!
  Ez
  
darkfader says:

24/08/2011 at 2:00 pm

cool thanks – I had avoided it so far and thought it’s not really worth it.
always interesting to have some assumptions corrected 🙂
