| |

Single-Node Linux Heartbeat Cluster with DRBD on Centos

The trick is simple, and many of those who deal with HA cluster get at least once to such a setup – have HA cluster without HA.

Yep. Single node, just to make sure you know how to get this system to play.

I have just completed it with Linux Heartbeat, and wish to share the example of a setup single-node cluster, with DRBD.

First – get the packages.

It took me some time, but following Linux-HA suggested download link (funny enough, it was the last place I’ve searched for it) gave me exactly what I needed. I have downloaded the following RPMS:

heartbeat-2.0.7-1.c4.i386.rpm

heartbeat-ldirectord-2.0.7-1.c4.i386.rpm

heartbeat-pils-2.0.7-1.c4.i386.rpm

heartbeat-stonith-2.0.7-1.c4.i386.rpm

perl-Mail-POP3Client-2.17-1.c4.noarch.rpm

perl-MailTools-1.74-1.c4.noarch.rpm

perl-Net-IMAP-Simple-1.16-1.c4.noarch.rpm

perl-Net-IMAP-Simple-SSL-1.3-1.c4.noarch.rpm

I was required to add up the following RPMS:

perl-IO-Socket-SSL-1.01-1.c4.noarch.rpm

perl-Net-SSLeay-1.25-3.rf.i386.rpm

perl-TimeDate-1.16-1.c4.noarch.rpm

I have added DRBD RPMS, obtained from YUM:

drbd-0.7.21-1.c4.i386.rpm

kernel-module-drbd-2.6.9-42.EL-0.7.21-1.c4.i686.rpm (Note: Make sure the module version fits your kernel!)

As soon as I finished searching for dependent RPMS, I was able to install them all in one go, and so I did.

Configuring DRBD:

DRBD was a tricky setup. It would not accept missing destination node, and would require me to actually lie. My /etc/drbd.conf looks as follows (thanks to the great assistance of linux-ha.org):

resource web {
protocol C;
incon-degr-cmd “echo ‘!DRBD! pri on incon-degr’ | wall ; sleep 60 ; halt -f”; #Replace later with halt -f
startup { wfc-timeout 0; degr-wfc-timeout 120; }
disk { on-io-error detach; } # or panic, …
syncer {
group 0;
rate 80M; #1Gb/s network!
}
on p800old {
device /dev/drbd0;
disk /dev/VolGroup00/drbd-src;
address 1.2.3.4:7788; #eth0 network address!
meta-disk /dev/VolGroup00/drbd-meta[0];
}
on node2 {
device /dev/drbd0;
disk /dev/sda1;
address 192.168.99.2:7788; #eth0 network address!
meta-disk /dev/sdb1[0];
}
}

I have had two major problems with this setup:

1. I had no second node, so I left this “default” as the 2nd node. I never did expect to use it.

2. I had no free space (non-partitioned space) on my disk. Lucky enough, I tend to install Centos/RH using the installation defaults unless some special need arises, so using the power of the LVM, I have disabled swap (swapoff -a), decreased its size (lvresize -L -500M /dev/VolGroup00/LogVol01), created two logical volumes for DRBD meta and source (lvcreate -n drbd-meta -L +128M VolGroup00 && lvcreate -n drbd-src -L +300M VolGroup00), reformatted the swap (mkswap /dev/VolGroup00/LogVol01), activated the swap (swapon -a) and formatted /dev/VolGroup00/drbd-src (mke2fs -j /dev/VolGroup00/drbd-src). Thus I have now additional two volumes (the required minimum) and can operate this setup.

Solving the space issue, I had to start DRBD for the first time. Per Linux-HA DRBD Manual, it had to be done by running the following commands:

modprobe drbd

drbdadm up all

drbdadm — –do-what-I-say primary all

This has brought the DRBD up for the first time. Now I had to turn it off, and concentrate on Heartbeat:

drbdadm secondary all

Heartbeat settings were as follow:

/etc/ha.d/ha.cf:

use_logd on #?Or should it be used?
udpport 694
keepalive 1 # 1 second
deadtime 10
initdead 120
bcast eth0
node p800old #`uname -n` name
crm yes
auto_failback off #?Or no
compression bz2
compression_threshold 2

I have also created a relevant /etc/ha.d/haresources, although I’ve never used it (this file has no importance when using “crm=yes” in ha.cf). I did, however, use it as a source for /usr/lib/heartbeat/haresources2cib.py:

p800old IPaddr::1.2.3.10/8/1.255.255.255 drbddisk::web Filesystem::/dev/drbd0::/mnt::ext3 httpd

It is clear that the virtual IP will be 1.2.3.10 in my class A network, and DRBD would have to go up before mounting the storage. After all this, the application would kick in, and would bring up my web page. The application, Apache, was modified beforehand to use the IP 1.2.3.10:80, and to search for DocumentRoot in /mnt

Running /usr/lib/heartbeat/haresources2cib.py on the file (no need to redirect output, as it is already directed to /var/lib/heartbeat/crm/cib.xml), and I was ready to go.

/etc/init.d/heartbeat start (while another terminal is open with tail -f /var/log/messages), and Heartbeat is up. It took it few minutes to kick the resources up, however, I was more than happy to see it all work. Cool.

The logic is quite simple, the idea is very basic, and as long as the system is being managed correctly, there is no reason for it to get to a dangerous state. Moreover, since we’re using DRBD, Split Brain cannot actually endanger the data, so we get compensated for the price we might pay, performance-wise, on a real two-node HA environment following these same guidelines.

I cannot express my gratitude to http://www.linux-ha.org, which is the source of all this (adding up with some common sense). Their documents are more than required to setup a full working HA environment.

Similar Posts

3 Comments

  1. Kudos for that writeup, you turned up #1 for drbd single node and I think I found anything I needed.

    And one more thing – single node clusters don’t just rock for testing, they have other big advantages:

    – Run all your applications under control of a ressource manager & monitor, so get automatic restart and a standardized notification framework
    – If you run all your systems as (even 1-node clusters) then you’ll have no operational differences, which saves a lot on admin errors
    and, the best thing:
    – If you run your app in a single node cluster you can easily turn it into a HA cluster once needed, without any downtime.
    So i.e. once your user decides “oh gosh, this is a critical app” then you’ll just define a second node and add it in.
    Tada, immediate failover support.

    They won’t pay for 24×7 HA any longer? well, just remove the 2nd node again.

    Have fun with your clusters – and go check out Veritas Cluster suite (there’s a 30-day demo) if you wanna see some real cluster power!
    (I used heartbeat and VCS – heartbeat is nice as long as things go well, but not quite… telco grade if you get me 🙂
    VCS on the other hand is completely cost-prohibitive, but it doesn’t hurt to know both options…

    1. Thanks. And now for some comments:
      1. You are correct. There are other uses for single node cluster. I specified only a single consideration with setting up a single-node cluster, and thanks for you bringing up other issues.
      2. I have had fun with many more clusters. VCS, HACMP, SunCluster, RedHat Cluster, Linux HA, and even MS Cluster (2000&2003). So I could say I am familiar with other clusters. Linux HA would not be my recommendation nowadays. I recommend RedHat Cluster, and I implement both it and VCS. When there are things beyond VCS ability to perform, there is hardly any way to make it do so (and I have seen some very twisted logic going in some of its implementations to override its limitations). It’s easier, on many levels, with RedHat Cluster.
      I will return your advice:
      Have fun with your clusters – go and check out RedHat Cluster (you can get it from your local RHEL installation media) if you wanna see some real cluster power!
      And RHCS is very much telco grade. You are most invited to check it, and pop a question if you want any clarification about it.

      Thanks!
      Ez

  2. cool thanks – I had avoided it so far and thought it’s not really worth it.
    always interesting to have some assumptions corrected 🙂

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.