Archive for the ‘Virtualization’ Category

Redhat Cluster and Citrix XenServer

Thursday, April 9th, 2015

I wanted to write down a guide for RHCS (Red Hat Cluster Suite) on RHEL/CentOS 6 and XenServer.

If you want to do that, you will need to overcome two major challenges. I want to save you the search and sum it all up here.

The first difficulty is the shared disk. In order to set up most common cluster scenarios, you will need shared storage. You could map the VMs to iSCSI LUNs external to the environment; however, if you do not have such infrastructure (either because everything is based on SAS/FC, or because you cannot set up iSCSI storage with a reasonable level of availability), you will want XenServer to allow you to share a VDI between two VMs.

In order to do so, you will need to add a flag to all the XenServers in your pool, and to create the VDI in a specific way. First, the flag: you need to create a file in /etc/xensource called "allow_multiple_vdi_attach". Do not forget to add it on all your XenServers:

touch /etc/xensource/allow_multiple_vdi_attach
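
If your hosts accept root SSH from one another (an assumption on my part, not a requirement of the procedure), a quick loop can cover the whole pool; the host names below are placeholders:

for h in xenserver01 xenserver02 xenserver03 xenserver04 ; do
    ssh root@$h touch /etc/xensource/allow_multiple_vdi_attach
done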

Next, you will need to create your VDI as a "raw" type. This is an example; you need to change the SR UUID to the one you use:

xe vdi-create sm-config:type=raw sr-uuid=687a023b-0b20-5e5f-d1ef-3db777ce7ae4 name-label="My Raw LVM VDI" virtual-size=8GiB type=user

You can find a Citrix article about it here.

Following that, you can complete your cluster setup and configuration. I will not go into detail about it here, as it is not the focus of this article. However, when it comes to fencing, you will need a solution. The solution I used was a fencing agent written specifically for XenServer using XenAPI, called fence-xenserver. I did not use the fencing agents repository (which that page also points to), because I was unable to compile the required components to run on CentOS 6; they just don't compile well. This, however, is a simple Python script which actually works.

In order to make it work, I did the following (a rough shell sketch follows the list):

  • Extracted the archive (version 0.8)
  • Placed fence_cxs* in /usr/sbin, and removed their ‘.py’ suffix
  • Placed XenAPI.py as-is in /usr/sbin
  • Verified /usr/sbin/fence_cxs* had execution permissions.
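
Something along these lines should do it; the archive name and layout are my guesses, so adjust them to what you actually downloaded:

tar xzf fence-xenserver-0.8.tar.gz
cd fence-xenserver-0.8
for f in fence_cxs*.py ; do cp "$f" /usr/sbin/"${f%.py}" ; done
cp XenAPI.py /usr/sbin/
chmod +x /usr/sbin/fence_cxs*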

Now, I needed to add it to the cluster configuration. Since the agent cannot handle accessing a non-pool-master host, it had to be defined once for each pool member (I cannot tell in advance which of them will hold the pool master role when a failover happens). So, these are the relevant parts of my cluster.conf:

<fencedevices>
    <fencedevice agent="fence_cxs_redhat" login="root" name="xenserver01" passwd="password" session_url="https://xenserver01"/>
    <fencedevice agent="fence_cxs_redhat" login="root" name="xenserver02" passwd="password" session_url="https://xenserver02"/>
    <fencedevice agent="fence_cxs_redhat" login="root" name="xenserver03" passwd="password" session_url="https://xenserver03"/>
    <fencedevice agent="fence_cxs_redhat" login="root" name="xenserver04" passwd="password" session_url="https://xenserver04"/>
</fencedevices>
<clusternodes>
    <clusternode name="clusternode1" nodeid="1">
        <fence>
            <method name="xenserver01">
                <device name="xenserver01" vm_name="clusternode1"/>
            </method>
            <method name="xenserver02">
                <device name="xenserver02" vm_name="clusternode1"/>
            </method>
            <method name="xenserver03">
                <device name="xenserver03" vm_name="clusternode1"/>
            </method>
            <method name="xenserver04">
                <device name="xenserver04" vm_name="clusternode1"/>
            </method>
        </fence>
    </clusternode>
    <clusternode name="clusternode2" nodeid="2">
        <fence>
            <method name="xenserver01">
                <device name="xenserver01" vm_name="clusternode2"/>
            </method>
            <method name="xenserver02">
                <device name="xenserver02" vm_name="clusternode2"/>
            </method>
            <method name="xenserver03">
                <device name="xenserver03" vm_name="clusternode2"/>
            </method>
            <method name="xenserver04">
                <device name="xenserver04" vm_name="clusternode2"/>
            </method>
        </fence>
    </clusternode>
</clusternodes>

Attached xenserver-fencing-cluster.xml for clarity (WordPress makes a mess out of that)

Note that I used four (4) entries, since my pool has four hosts. Also note the VM name (it is case sensitive), and the methods: one for each host, since you don't want them running in parallel, but one at a time. Failover time was between 5 and 15 seconds in my tests, depending on which host is the actual pool master (xenserver04 takes the longest, obviously). I did not test it with the pool master down (before or without HA kicking in), nor with hosts being down, where the TCP timeout is longer (compared to connecting to a host which responds immediately that it is not the pool master). However, when iLO fencing takes about 30-60 seconds, I am not complaining about the current timeouts.
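
To check that the fencing configuration actually works, the standard RHCS tool fence_node can be used from another cluster member (generic cluster tooling, nothing specific to this agent); it should power-cycle the named node through the matching entries above:

fence_node clusternode2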

Clone corrupted disk in XenServer

Friday, October 31st, 2014

Following some unknown problems, I recently had several XenServer machines (different clusters, different sites and customers, and even different versions) with VDI end-of-file issues. It means that while you can start the VM correctly and perform XenMotion to another server, you are unable to do any storage-migration task: neither Storage XenMotion, nor VDI copy, nor VM-move commands. In some cases, snapshots taken from the "ill" disks misbehaved just the same. This is rather frustrating, because the way to solve it is by cloning the disk into a new one, and your hands are bound.

The method I devised for the task is rather simple: create a new VDI (on the target storage), map the original VDI and the new VDI to a domain0 machine, and copy block-by-block using the 'dd' command. This is slow and crude, but it works.

How to do it? The steps, in general are:

  • Create a new VDI of the same size as, or larger than, the original VDI
  • Note the UUIDs of the old and new VDIs
  • Note the UUID of the control domain you intend to use for this task (it has to be one which has access to both VDIs)
  • Turn off the 'ill' VM, mark the 'ill' VDI in a way that will let you identify it easily (a unique name label, for example), and unmap it from the VM
  • Create VBDs for the VDIs on the control domain, and plug them
  • Create Linux device files for the VBDs on the control domain
  • Perform 'dd' between the old and new disks (do not get confused about the direction, or you will overwrite your data!)
  • Unplug and destroy the VBDs
  • Map the new VDI to the VM
  • Start the VM

I won't go over how to create a VDI; use the XenCenter GUI to do it. Place it on the desired SR, and give it a noticeable name so you will be able to recognise it.

Get the UUID of the new VDI: xe vdi-list name-label="The name label I used" | grep ^uuid | awk '{print $NF}'
Do the same for the source VDI. Use its name label, or use xe vbd-list to obtain its VDI UUID.
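
A possible vbd-list invocation for that (the VM name label here is a placeholder) would print the VDI UUID and device position of each of the VM's disks:

xe vbd-list vm-name-label="My ill VM" params=vdi-uuid,device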

Get the UUID of the control domain you want to use: xe vm-list is-control-domain=true

Unmap the VM's VDI from the VM (after setting a very noticeable name for it, and noting the disk number/ID it had on the VM).

On the control domain, run:
xe vbd-create vdi-uuid=<'ill' VDI UUID> vm-uuid=<control domain UUID> device=xvda
This command will return a UUID. Note it down; this is the source VBD UUID.

Run it again for the target VDI. This time, use device=xvdb.

Note this UUID as well. This is the target VBD UUID.

We need to plug the VBDs and create device nodes for them:
xe vbd-plug uuid=<UUID of source VBD created above>

A new block device is now available to the XenServer host's control domain. To identify the new device, we now need to run:
tail -1 /proc/partitions
The resulting line would look something like this:

253 10 40960000 tdk

The interesting fields are the first (major number), the second (minor number) and the last (device name). We will use them to create a block device file using the 'mknod' command:

mknod /dev/tdk b 253 10

The result will be a block device file called /dev/tdk with the major 253 and minor 10.

We repeat the process for the target VBD, and then we have two additional disks visible on the control domain.
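
If you prefer not to copy the numbers by hand, a small bash-only shorthand (my own habit, not part of the original procedure) can read the last line of /proc/partitions and create the matching node:

read MAJOR MINOR BLOCKS NAME < <(tail -1 /proc/partitions)
mknod /dev/$NAME b $MAJOR $MINOR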

We can (and should) now copy from the source to the target using dd (don't mix them up!). Assuming /dev/tdk is the source and /dev/tdl is the target, it would look like this:

dd if=/dev/tdk of=/dev/tdl bs=1M oflag=direct

We use oflag=direct to enforce direct writes, so as not to saturate the control domain's caches.

Following the operation, to release the disks and get back to business, we do:

  • xe vbd-unplug uuid=<SOURCE VBD UUID>
  • xe vbd-destroy uuid=<SOURCE VBD UUID>
  • xe vbd-unplug uuid=<TARGET VBD UUID>
  • xe vbd-destroy uuid=<TARGET VBD UUID>
  • Map the new disk to the VM, to the correct device number
  • Start the VM

If it starts OK, we can destroy the old VDI and have a ball. If it doesn't, we can always map the previous (source) VDI back to the VM, and start anew.
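
For reference, here is the whole flow condensed into one rough sketch, using the same angle-bracket placeholders as above; the device names tdk and tdl are examples and will differ on your system:

SRC_VDI=<'ill' VDI UUID>
DST_VDI=<new VDI UUID>
DOM0=<control domain UUID>
SRC_VBD=$(xe vbd-create vdi-uuid=$SRC_VDI vm-uuid=$DOM0 device=xvda)
DST_VBD=$(xe vbd-create vdi-uuid=$DST_VDI vm-uuid=$DOM0 device=xvdb)
xe vbd-plug uuid=$SRC_VBD    # then mknod its device node, e.g. /dev/tdk
xe vbd-plug uuid=$DST_VBD    # then mknod its device node, e.g. /dev/tdl
dd if=/dev/tdk of=/dev/tdl bs=1M oflag=direct
xe vbd-unplug uuid=$SRC_VBD ; xe vbd-destroy uuid=$SRC_VBD
xe vbd-unplug uuid=$DST_VBD ; xe vbd-destroy uuid=$DST_VBD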

I hope it helps.

XenServer and its damn too small system disks

Thursday, December 26th, 2013

I love XenServer. I love the product, and I believe it to be a very good answer for SMBs and enterprises. It lacks external vendor support, true, but the price tag of many of the 'external capabilities' on VMware, for instance, is very high, so many SMBs, especially, learn to live without them. XenServer gives a nice pack of features at a very reasonable price.

One of the missing features is the management packs of hardware vendors, such as HP, Dell and IBM. Well, HP does have something, and its installation is always some sort of a challenge, but they do have one, so scratch that. Others, however, do not supply management packs. The bright side is that with Domain0 being a full-featured i386 CentOS 5 distribution, I can install the CentOS/RHEL management packs and have a ball.

This brings us to another challenge: the size of the system disk (root partition) is too small by default - 4GB - and while it works quite well without any external components, it tends to fill up very fast once external packages are installed, like Dell tools, etc. Not only that, but on a system with many patches, the patch backups take their toll and consume valuable space. While my solution will not work for those who aim at the smallest possible footprint, such as an SD card or disk-on-key for the XenServer OS, it is aimed at the rest of us, where the system resides on at least several tens of gigabytes and can sustain the 'loss' of an additional 4GB.

This process modifies the install.img file and authors the CD as a new one, your own privately-modified instance of the XenServer installation media. Mind you that this change will be effective only for new installations. I have not tested it as an upgrade path for existing systems, although I believe no harm will be done to those who upgrade. Also, it was performed and tested on XenServer 6.2 (not 6.2 SP1 or prior versions), although I believe the process should look pretty similar for them.

You will need a Linux machine to perform this operation end to end. You could probably use some Windows applications along the way, but I have no idea which or what.

Step one: Open the ISO, and copy it to somewhere useful (assume /tmp is useful):

mkdir /tmp/ISO
mkdir /tmp/RW
mount -o loop /path/to/XenServer-6.2.0-install-cd.iso /tmp/ISO
cd /tmp/ISO
tar cf - . | ( cd /tmp/RW ; tar xf - )

Step two: Extract the contents of the install.img file in the root of the CDROM:

mkdir /tmp/install
cd /tmp/install
cat /tmp/RW/install.img | gzip -dc | cpio -id

Step three: Edit the contents of the definitions file:

vi opt/xensource/installer/constants.py

Change the value of 'root_size' to something to your taste. Mind you that with 4GB it was tight, but still usable, even with additional 3rd-party tools, so don't get greedy. I set it to 6GB (6144).
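
After the edit, the relevant line in constants.py should read roughly as follows (the exact layout of the file differs between XenServer versions, so treat this as an illustration only):

root_size = 6144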

Step four: Wrap it up:

cd /tmp/install ; find . | cpio -o -H newc | gzip -9 > /tmp/RW/install.img

Step five: Author the CD, and prepare it to be burned:

cd /tmp/RW
mkisofs -J -T -o /share/temp/XenServer-6.2-modified.iso -V "XenServer 6.2" -volset "XenServer 6.2" -A "XenServer 6.2" \
 -b boot/isolinux/isolinux.bin -no-emul-boot -boot-load-size 4 -boot-info-table -R -m TRANS.TBL .

You now have a file called 'XenServer-6.2-modified.iso' (at the output path given to mkisofs; /share/temp in the example above), which will install your XenServer with the root partition size you set. Cheers.

BTW, and to make it entirely clear: I cannot be held responsible for any damage caused to any system you tweaked using this (or, for that matter, any other) guide I published.

Enjoy your XenServer’s new apartment!

XenServer – increase LVM over iSCSI LUN size – online

Wednesday, September 4th, 2013

The following procedure was tested by me, and was found to be working. The version of XenServer I am using in this particular case is 6.1; however, I believe this method is generic enough that it could work for every version of XS, assuming you are using iSCSI and LVM (that is, not NetApp, CSLG, NFS and the like). It might act as a general guideline for Fibre Channel communication, but I have not tested that, and thus I have no idea how it will work. It should work with some modifications when using multipath; regarding multipath, you can find in this particular blog some notes on increasing multipath disks. Check the comments there too - they might offer a better and simpler way of doing it.

So - let's begin.

First - increase the size of the LUN through the storage. For NetApp, it involves something like:

lun resize /vol/XenServer/luns/SR1.lun +1t

You should always make sure your storage volume, aggregate, RAID group, pool or whatever is capable of holding the data, or, if using thin provisioning, that a well-tested monitoring system is in place to alert you when you are running low on storage disk space.

Now, we should identify the LUN. From now on - every action should be performed on all XS pool nodes, one after the other.

cat /proc/partitions

We should keep the output of this command somewhere. We will use it later on to identify the expanded LUN.

Now - let's scan for storage changes:

iscsiadm -m node -R

Running the previous command again will now show slightly different output, and we can now identify the modified LUN:

cat /proc/partitions
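
If you saved the earlier output to a file, for example with 'cat /proc/partitions > /tmp/partitions.before' (the file name is my own choice), a plain diff after the rescan points straight at the grown LUN:

diff /tmp/partitions.before /proc/partitions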

Now we need to grow the LVM physical volume to use the new size. XenServer uses LVM, so we harness it to our needs. Let's assume that the modified disk is /dev/sdd.

pvresize /dev/sdd

After completing this task on all pool hosts, we should run an SR scan, either through the CLI or through the GUI. When the scan operation completes, the new size will show.
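
From the CLI, the scan would look roughly like this, with your SR's UUID substituted:

xe sr-scan uuid=<SR UUID>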

Hope it helps!

XenServer 6.2 is now Open Source!

Tuesday, June 25th, 2013

This is amazing news to me. I really love XenServer. I think Citrix was able to make good use of Linux mechanisms for the purposes of virtualization, without abusing the OS layer (like some of the other virtualization solutions did). The file locations are decent (for example, most parts are located in /opt, which is the right place for them to be), and in general, it always felt to me as if the Citrix developers (and the original XenSource developers before them) had respect for the OS. I liked it, and still do.

The product was not perfect. There were ups and downs, there were times when I cursed it, and times when I was full of joy by its behavior – which can happen from time to time, if you really like and care about a software product.

So, today Citrix announced that XenServer 6.2, the shiny new release, will become fully open source, and that the entire feature set of previous XenServer versions can be yours for free. However, for support and some minor administrative tasks, you will want to purchase the licensed version. As far as I understood, these are the differences - almost nothing more. Wow. Kudos!

Grab it while it’s hot here!