Posts Tagged ‘bash’

ZFS with Redhat Cluster Suite

Friday, July 25th, 2014

This is a very nice project I have been working on. The hardware at hand – two servers, with a shared SAS bus containing several SAS disks. Since it’s a shared bus, no RAID solution would cut it, and as I don’t want to waste disks with ASM (“normal” redundancy meaning half the size…), I went to ZFS storage.

ZFS is a wonderful technology, with many advantages, but with some dangerous pitfalls. As I prefer Linux, I did not bother with any Solaris solutions, and went directly to Centos 6. I will describe my cluster setup below.

I will disclose the entire setup, including hardware layout, Linux platform, ZFS module parameters, the Redhat Cluster Suite ZFS agent I wrote and the cluster.conf configuration file. I will also share my considerations regarding some of the choices I made. In addition, this system was designed to act as NFS storage for a Citrix XenServer pool, so I will have to describe the changes I had to perform on the XenServer itself (which might make it unsupported, but I will have to live with it) to allow it to handle the timeouts resulting from server failover.

So first – the servers – each having a single CPU (quad core), 24GB RAM, and dual 1Gb/s NICs. Also – a tiny internal SATA disk is used for the OS. The shared disks – at the moment, 10 SAS disks, dual port (notice – older HP disks might state in very small letters that they are only single-port SAS disks…), 72GB, 10K RPM. The zpool, called ‘share’, consists of two five-disk RaidZ1 vdevs. As I mentioned before – ZFS seemed like the best possible option allowing me to achieve my goals at minimal cost.
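
For reference, a pool of this layout could be created roughly like this. This is a sketch only – sdb through sdk are placeholders, and on a shared bus the persistent /dev/disk/by-id names are a safer choice than the short device names:

# Two RaidZ1 vdevs of five disks each, pool named 'share'
zpool create share \
	raidz1 sdb sdc sdd sde sdf \
	raidz1 sdg sdh sdi sdj sdk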

When I came to this project, I wanted to be able to use a native ZFS cluster agent, and not a ‘script’ agent, which takes a very long time to respond (30 seconds). Also – I wanted to be able to handle multiple storage pools concurrently – each floating on its own. While I have only one at the moment, I wanted the ability to have fine-grained control over multiple pools. In addition – I was unable (or unwilling?) to handle each of the multiple filesystems introduced with a pool individually. I wanted to be able to import or export the pool silently, and with a clear head, thus I had to verify that the multiple filesystems are not in use as part of the export process.

As an agent, I wanted to comply with Redhat Cluster Suite (RHCS from now on) OCF syntax. I used the supplied fs.sh script as an inspiration for my agent script, so some of it might look familiar. All credit goes to the original authors, of course.

The operating system I selected was Centos 6. Centos is based on Redhat Linux, and I find it mature and stable, which is exactly what I want when I plan a production-ready, enterprise-class storage solution. The version had to be x86_64, due to ZFS requirements, and due to the amount of RAM in the server.

To handle ZFS options, I added a file called /etc/modprobe.d/zfs.conf, with the following content

install zfs /bin/rm -f /etc/zfs/zpool.cache && /sbin/modprobe --ignore-install zfs
options zfs zfs_arc_max=12593790976
options zfs zfs_arc_min=12593790976

I had to verify there is no zpool.cache file. Since my pool was rather small (planned for 24 disks max), I was not concerned by the longer import process caused by not having the zpool.cache file. I was more concerned with an automatic import process which might happen, and had to prevent it at almost any cost. In addition, I learned from other systems that the ARC memory should never exceed half the RAM, and it should be given just a little under that.
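
Deriving such a value on the server itself can be scripted. A small sketch – the 48% figure is just an example of “a little under half”, and 12593790976 above is simply a bit under half of 24GB:

# Compute "a little under half the RAM" in bytes, here 48% of MemTotal
MEM_KB=`awk '/^MemTotal:/ {print $2}' /proc/meminfo`
ARC_BYTES=$(( MEM_KB * 1024 * 48 / 100 ))
echo "options zfs zfs_arc_max=$ARC_BYTES"
echo "options zfs zfs_arc_min=$ARC_BYTES"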

Of course, when changing such module settings, you need to recreate initrd (dracut -f) to be on the safe side later on.

The zfs.sh agent script was placed in the /usr/share/cluster directory. You must have rgmanager installed for this directory to exist, and anyhow, without rgmanager, you will have no cluster whatsoever.

This is the contents of the zfs.sh file. Notice that it is not compatible with Luci, so if you’re using it – them kids won’t play well together.

#!/bin/bash
 
LC_ALL=C
LANG=C
PATH=/bin:/sbin:/usr/bin:/usr/sbin
export LC_ALL LANG PATH
# Private return codes
FAIL=2
NO=1
YES=0
YES_STR="yes"
 
. $(dirname $0)/ocf-shellfuncs
 
meta_data()
{
    cat << EOT
<?xml version="1.0"?>
<resource-agent name="zfs" version="rgmanager 2.0">
    <version>1.0</version>
    <longdesc lang="en">
	This script will import and export ZFS storage pools
	It will make sure to mount and umount all child filesystems
    </longdesc>
    <shortdesc lang="en">
        This is a ZFS pool
    </shortdesc>

    <parameters>
        <parameter name="name" primary="1">
            <longdesc lang="en">
                Symbolic name for this zfs pool
            </longdesc>
            <shortdesc lang="en">
                File System Name
            </shortdesc>
            <content type="string"/>
        </parameter>

        <parameter name="pool" required="1">
            <longdesc lang="en">
		ZFS Pool name or ID
            </longdesc>
            <shortdesc lang="en">
                ZFS pool name
            </shortdesc>
            <content type="string"/>
        </parameter>

        <parameter name="mount">
            <longdesc lang="en">
		ZFS Pool alternate mount
            </longdesc>
            <shortdesc lang="en">
                ZFS pool alternate mount
            </shortdesc>
            <content type="string"/>
        </parameter>

        <parameter name="force_unmount">
            <longdesc lang="en">
                If set, the cluster will kill all processes using 
                this file system when the resource group is 
                stopped.  Otherwise, the unmount will fail, and
                the resource group will be restarted.
            </longdesc>
            <shortdesc lang="en">
                Force Unmount
            </shortdesc>
            <content type="boolean"/>
        </parameter>

        <parameter name="self_fence">
            <longdesc lang="en">
                If set and unmounting the file system fails, the node will
                immediately reboot.  Generally, this is used in conjunction
                with force-unmount support, but it is not required.
            </longdesc>
            <shortdesc lang="en">
                Seppuku Unmount
            </shortdesc>
            <content type="boolean"/>
        </parameter>
    </parameters>

    <actions>
        <action name="start" timeout="5"/>
        <action name="stop" timeout="5"/>
	<!-- Note: active monitoring is constant and supplants all check depths -->
        <!-- Checks to see if we can read from the mountpoint -->
        <action name="status" depth="10" timeout="5" interval="1m"/>
        <!-- Checks to see if we can write to the mountpoint (if !ROFS) -->
        <action name="status" depth="20" timeout="5" interval="10m"/>
        <action name="meta-data" timeout="5"/>
        <action name="validate-all" timeout="5"/>
    </actions>
</resource-agent>
EOT
}
 
ocf_log()
{
        echo $*
}
 
verify_driver() {
	ocf_log info "Verifying ZFS driver"
	lsmod | grep -w zfs > /dev/null 2>&1 && return 0
	ocf_log err "ZFS driver is not loaded"
	return $OCF_ERR_ARGS
}
 
verify_poolname() {
	ocf_log info "Verify pool name "
	if [ -z "$OCF_RESKEY_pool" ]
	then
		ocf_log err "Missing pool name"
		return $OCF_ERR_ARGS
	fi
	zpool import | grep pool: | grep -w $OCF_RESKEY_pool > /dev/null 2>&1 && return 0
	ocf_log err "Cannot identify pool name"
	return $OCF_ERR_ARGS
}
 
verify_mounted_poolname() {
	ocf_log info "Verify pool name "
	if [ -z "$OCF_RESKEY_pool" ]
	then
		ocf_log err "Missing pool name"
		return $OCF_ERR_ARGS
	fi
	zpool list $OCF_RESKEY_pool > /dev/null 2>&1 && return 0
	ocf_log err "Cannot identify pool name"
	return $OCF_ERR_ARGS
}
 
verify_mountpath() {
	ocf_log info "Verifying alternate root mount path"
	[ -z "$OCF_RESKEY_mount" ] &amp;&amp; return 0
	declare mp="${OCF_RESKEY_mount}"
	case "$mp" in
		/*)    	# found it
                	;;
        	*)      # invalid format
			ocf_log err \
"verify_mountpath: Invalid mount point format (must begin with a '/'): \'$mp\'"
                return $OCF_ERR_ARGS
                ;;
        esac
}
 
pool_import() {
	ocf_log info "Importing pool"
	OPTS=""
	[ -n "$OCF_RESKEY_mount" ] &amp;&amp; OPTS="-R $OCF_RESKEY_mount"
	zpool import $OCF_RESKEY_pool $OPTS
	RET="$?"
	if [ "$RET" -ne "0" ]
	then
		ocf_log info "Cannot import without applying force"
		zpool import -f $OCF_RESKEY_pool $OPTS
		RET="$?"
	fi
	if [ "$RET" -ne "0" ]
	then
		ocf_log err "Pool import failed for $OCF_RESKEY_pool. error=$RET"
		return 1
	fi
	ocf_log info "Imported ZFS pool"
	return $RET
}
 
check_and_release_fs() {
	ocf_log info "Checking and releasing FS"
	FS=""
	case ${OCF_RESKEY_force_unmount} in
        $YES_STR|on|true|1)	force_umount=$YES ;;
        *)		        force_umount="" ;;
        esac
 
	RET=0
	for i in `zfs list -t filesystem | grep ^${OCF_RESKEY_pool} | awk '{print $NF}'`
	do
		# To be on the safe side. Why not?
		sleep 1
		# Is it mounted?
		if ! df -l | grep -w "$i" > /dev/null 2>&1
		then
			ocf_log info "Filesystem $i is not mounted"
			continue
		fi 	
		if [ `lsof $i | wc -l` -gt "0" ]
		then
			ocf_log info "Filesystem $i is in use"
			if [ "$force_umount" ]
			then
				ocf_log info "Attempting to kill processes on $i filesystem"
				fuser -k $i
				sleep 2
				if [ `lsof $i | wc -l` -gt "0" ]
				then
					ocf_log err "Cannot umount filesystem $i - filesystem in use"
					return 1
				fi
			else
				ocf_log err "Cannot umount filesystem $i
 - filesystem in use"
                                return 1
			fi
		fi
	done
	return $RET	
}
 
self_fence() {
	ocf_log info "Should we validate and call self-fence?"
	case ${OCF_RESKEY_self_fence} in
		$YES_STR|on|true|1)       self_fence=$YES ;;
       		*)              self_fence="" ;;
        esac	
 
	if [ "$self_fence" ]; then
		ocf_log alert "umount failed - REBOOTING"
               	sync
                reboot -fn
	fi
	return $OCF_ERR_GENERIC
}
 
pool_export() {
	ocf_log info "Exporting zfs pool"
	check_and_release_fs || self_fence
	zpool export $OCF_RESKEY_pool
	RET="$?"
	if [ "$RET" -ne "0" ]
	then
		ocf_log err "Pool export failed for $OCF_RESKEY_pool. error=$RET"
		return 1
	fi
	return $RET
}
 
start() {
	ocf_log info "Starting ZFS"
	verify_driver || return $OCF_ERR_ARGS 
	verify_poolname || return $OCF_ERR_ARGS
	verify_mountpath || return $OCF_ERR_ARGS
	pool_import
	# Handle filesystem?
}
 
stop() {
	ocf_log info "Starting ZFS"
	verify_driver || return $OCF_ERR_ARGS 
	verify_mounted_poolname || return $OCF_ERR_ARGS
	verify_mountpath || return $OCF_ERR_ARGS
	# Handle filesystem?
	pool_export
}
 
is_imported() {
	ocf_log debug "Checking if $OCF_RESKEY_pool is imported"
	zpool list ${OCF_RESKEY_pool} > /dev/null 2>&1
	return $?
}
 
is_alive() {
	ocf_log debug "Checking ZFS pool read/write"
	declare file=".writable_test.$(hostname)"
	declare TIMEOUT="10s"
	[ -z "$OCF_CHECK_LEVEL" ] &amp;&amp; export OCF_CHECK_LEVEL=0
	mount_point=`zfs list ${OCF_RESKEY_pool} | grep ${OCF_RESKEY_pool} | awk '{print $NF}'`
	test -d "$mount_point"
        if [ $? -ne 0 ]; then
                ocf_log err "${OCF_RESOURCE_INSTANCE}: is_alive: $mount_point is not a directory"
                return $FAIL
        fi
	[ $OCF_CHECK_LEVEL -lt 10 ] && return $YES
 
        # depth 10 test (read test)
        timeout -s 9 $TIMEOUT ls "$mount_point" > /dev/null 2> /dev/null
        errcode=$?
        if [ $errcode -ne 0 ]; then
                ocf_log err "${OCF_RESOURCE_INSTANCE}: is_alive: failed read test on [$mount_point]. Return code: $errcode"
                return $NO
        fi
 
	[ $OCF_CHECK_LEVEL -lt 20 ] && return $YES
 
        # depth 20 check (write test)
        rw=$YES
        for o in `echo $OCF_RESKEY_options | sed -e s/,/\ /g`; do
                if [ "$o" = "ro" ]; then
                        rw=$NO
                fi
        done
	if [ $rw -eq $YES ]; then
                file="$mount_point"/$file
                while true; do
                        if [ -e "$file" ]; then
                                file=${file}_tmp
                                continue
                        else
                                break
                        fi
                done
                timeout -s 9 $TIMEOUT touch $file > /dev/null 2> /dev/null
                errcode=$?
                if [ $errcode -ne 0 ]; then
                        ocf_log err "${OCF_RESOURCE_INSTANCE}: is_alive: failed write test on [$mount_point]. Return code: $errcode"
                        return $NO
                fi
                rm -f $file > /dev/null 2> /dev/null
        fi
 
	return $YES
}
 
monitor() {
	ocf_log debug "Checking ZFS pool $OCF_RESKEY_pool, Level $OCF_CHECK_LEVEL"
	verify_driver || return $OCF_ERR_ARGS 
	is_imported
	RET=$?
	if [ "$RET" -ne $YES ]; then
                ocf_log err "${OCF_RESOURCE_INSTANCE}: ${OCF_RESKEY_device} is not mounted on ${OCF_RESKEY_mountpoint}"
                return $OCF_NOT_RUNNING
        fi
	is_alive
	return $RET
}
 
if [ -z "$OCF_CHECK_LEVEL" ]; then
	OCF_CHECK_LEVEL=0
fi
 
case $1 in
start)
	ocf_log info "zfs start $OCF_RESKEY_pool\n"
	OCF_CHECK_LEVEL=0
	monitor
	[ "$?" -ne "0" ] &amp;&amp; start || ocf_log info "$OCF_RESKEY_pool is already mounted"
	exit $?
	;;
stop)
	ocf_log info "zfs stop $OCF_RESKEY_pool\n"
	OCF_CHECK_LEVEL=0
	monitor
	[ "$?" -eq "0" ] &amp;&amp; stop || ocf_log info "$OCF_RESKEY_pool is not mounted"
	exit $?
	;;
status|monitor)
	ocf_log debug "ZFS monitor $OCF_RESKEY_pool"
	monitor
	exit $?
	;;
meta-data)
	echo -e "zfs metadat $OCF_RESKEY_address\n" &gt;&gt;/tmp/out
	meta_data
	exit 0
	;;
validate-all)
	exit 0
	;;
*)
	echo "usage: $0 {start|stop|status|monitor|restart|meta-data|validate-all}"
	exit $OCF_ERR_UNIMPLEMENTED
	;;
esac

 

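Before building cluster.conf, the agent can be exercised by hand, the way rgmanager would call it. This is only a sketch; the pool name and alternate mount point below are the examples used in this post:

cd /usr/share/cluster
chmod +x zfs.sh
# Simulate rgmanager by exporting the resource keys the agent reads
export OCF_RESKEY_pool=share
export OCF_RESKEY_mount=/share
export OCF_RESKEY_force_unmount=yes
./zfs.sh start
./zfs.sh status
./zfs.sh stop
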
All I had to do now was to build the cluster.conf file.


The reason I placed the IP address as the last to start and the first to stop was that the other way around, the NFS client would receive an ordered disconnection command, and would not bother to establish a connection with the remaining server. Abruptly taking away the clustered IP address causes the NFS clients to initiate a reconnection process, from which the systems are supposed to recover.

I have left this article incomplete for a while now. It has some stuff I do like to share, so I am sharing it as-is. I will (some day) complete it.

Two advanced bash tricks

Saturday, June 7th, 2014

Well, tricks is not the right word to describe advanced shell scripting usage, however, it does make some sense. These two topics are relevant to Bash version 4.0 and above, which is common for all modern-enough Linux distributions. Yours probably.

These ‘tricks’ are for advanced Bash scripting, and will assume you know how to handle the other advanced Bash topics. I will not cover the basics here.

Trick #1 – redirected variable

What it means is the following.

Let’s assume that I have a list of objects, say: LIST="a b c d", and you want to create a set of new variables by these names, holding data. For example:

a=1
b=abc
c=3
d=$a

How can you iterate through the contents of $LIST, and do it right? If you have only four objects, you can live with stating them manually; however, for a dynamic list (for example, the /dev/sd*1 devices in your system), you might find it a bit problematic.

A solution is to use redirected variables. Up until recently, the method involved a very complex ‘expr’ command which was unpleasant at best, and hard to figure out at its worst. Now we can use normal redirected variables, using the exclamation mark. See here:

for OBJECT in $LIST
do
# Place data into the list
export $OBJECT=$RANDOM
done

for OBJECT in $LIST
do
# Read it!
echo ${!OBJECT}
done

Firstly – to assign a value to the redirected variable, we must use the ‘export’ prefix. $OBJECT=$RANDOM will not work.
Secondly – to show the content, we need to use the exclamation mark inside the variable’s curly brackets, meaning we cannot call it $!OBJECT, but rather ${!OBJECT}.
We cannot dynamically create the variable name inside the curly brackets either, so ${!abc_$SUFFIX} won’t work. We can create the name beforehand, and then use it, like this: DynName=abc_$SUFFIX ; echo ${!DynName}
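
To tie this to the dynamic /dev/sd*1 example mentioned earlier, here is a small sketch. The device names depend on your system, and blockdev requires root:

# Create a variable per partition, named after the device, holding its size in sectors
for DEV in /dev/sd*1
do
	OBJECT=`basename $DEV`              # e.g. sda1 - a valid variable name
	export $OBJECT=`blockdev --getsz $DEV`
done

# Read them back through the redirected (indirect) reference
for DEV in /dev/sd*1
do
	OBJECT=`basename $DEV`
	echo "$OBJECT holds ${!OBJECT} sectors"
done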

Trick #2 – Using strings as an array index

It was impossible in the past, but now one of the most useful features of smart lists is accessible in shell: we can index an array with a string label. For example:

for FILE in $( ls )
do
array["$FILE"]=$( ls -la $FILE | awk '{print $5}' )
done

In this example we create array cells with the label being the name of the file, and populate them with the size of this file (the 5th field of ‘ls -la’ output).

This will work only if the array was declared beforehand using the following command (using the array name ‘array’ here):

declare -A array

Later on, it is easier to query data out of the array, as long as you know its index name. For example

FILE=ez-aton.txt
echo ${array[$FILE]}

Of course – assuming there is an entry for ez-aton.txt in this array.

The best use I have found for this feature so far was for comparing large lists, without the need to reorder the objects in the array. I find that it boosts the capabilities of arrays in Bash, and arrays, in general, are very powerful tools to handle long and complex lists, when you need to note the position.
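
As an illustration of that comparison use case, here is a short sketch comparing file sizes between two directories, without caring about order. The directory paths are examples only:

declare -A SIZE_A
# Index the first directory by file name
cd /some/dir_a
for FILE in $( ls )
do
	SIZE_A["$FILE"]=`ls -la $FILE | awk '{print $5}'`
done

# Compare the second directory against the recorded sizes
cd /some/dir_b
for FILE in $( ls )
do
	if [ "${SIZE_A[$FILE]}" != "`ls -la $FILE | awk '{print $5}'`" ]
	then
		echo "$FILE differs or does not exist in dir_a"
	fi
done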

That’s all, folks. Note that the blog editor might change quotes (single and double) and dashes to their UTF-8 versions, which will not go well in a copy/paste attempt to experiment with the code examples placed here. You might need to edit the contents and fix the quotes/dashes manually.

If you have any questions, comment here, I will be happy to elaborate. I hope to be able to add more complex Bash stuff I get into once in a while :-)

XenServer – Setting virtual disks names based on the VM names

Wednesday, January 2nd, 2013

One of the worst things you can have in XenServer is some wise guy performing a ‘forget storage’ on a storage device still holding virtual disks related to VMs. As the XenServer database is internal (for the whole pool) and not per-VM, all references to these virtual disks disappear, and you remain with a bunch of VMs without disks, and later on, when you recover from the shock and restore the SR, with a bunch of virtual disks you have no clue as to where they belong. Why? Because we are lazy, and we tend to skip the part where you can (or is it – should?) define a custom name for your virtual disks so you would know later on (for example – in the case specified above) where they belong(ed).

To solve this annoying issue, and to save time for Citrix XenServer admins, I have created a script which resets the VDI (virtual disk object) names to the name of the VM plus the logical position of the virtual disk (for example: xvda, hdb, etc) related to the VM. That way, it becomes very easy to identify the disks in case of such an annoying micro-catastrophe (micro because no data is lost, just where it belongs…).

The script can be called manually, and since we’re lazy people, and will forget to handle it manually at regular intervals, and will accumulate virtual machines with “Template of XYZ” virtual disks, it can be called from cron. When called manually, it asks the user to proceed by pressing ‘Enter’. If called from cron, it just runs.
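
For example, a nightly crontab entry on the pool master could look like this. The path and schedule are just examples; use whatever name you saved the script under:

# m h dom mon dow command
0 3 * * * /usr/local/bin/xenserver_reset_vdi_names.sh > /dev/null 2>&1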

Enjoy!

 

#!/bin/bash
# This script will reset the names of the virtual disks used for each VM to a standard name, based on the VM name
# and the disk position
# It is meant to solve problems where due to 'forget storage' operations or the likes
# virtual disk associations disappear, and you face many disks having the same name
#
# Written by Ez-Aton: http://run.tournament.org.il
 
 
if [ -t 1 ]
then
        echo "This script will reset *all* VM disks to a name constructed of the VM and the disk name (xvda, hdb, etc)"
        echo "This operation is not reversible, however, it can be called repeatedly"
        echo "If you want this script to skip a said virtual disk, make sure its name includes the name of the VM"
        echo "For example 'vm1 the real important data disk' for a disk used by vm1."
        echo "Note that the name is case sensitive, and it is very important that to match the name using upper/lower case letters as needed"
        echo "To abort, press Ctrl+C"
        echo "To proceed, press Enter"
        read abc
fi
 
VM_LIST=`xe vm-list is-control-domain=false --minimal | tr , ' '`
 
for i in $VM_LIST
do
        # Resetting several parameters, so we have a clean start
        VM_NAME=""
        VBD_LIST=""
        VDI_LIST=""
        # We iterate through all existing VMs, to get both their names, and their disks
        VM_NAME="`xe vm-param-get uuid=$i param-name=name-label`"
        if [ -z "$VM_NAME" ]
        then
                # We have a problem with empty VM names, so we will use the VMs uuid
                VM_NAME=$i
        fi
        VBD_LIST=`xe vbd-list vm-uuid=$i --minimal | tr , ' '`
        for j in $VBD_LIST
        do
                # Resetting several parameters, so we have a clean start
                VDI_UUID=""
                DEV_NAME=""
                # We iterate through all existing VBDs to reset the VDI name
                VDI_UUID=`xe vbd-param-get uuid=$j param-name=vdi-uuid`
                if [ "$VDI_UUID" == "<not in database>" ]
                then
                        # This is a virtual CDROM
                        continue
                fi
                DEV_NAME=`xe vbd-param-get uuid=$j param-name=device`
                VDI_NAME=`xe vbd-param-get uuid=$j param-name=vdi-name-label`
 
                # Test if the name was reset in the past or manually
                TGT_NAME="$VM_NAME $DEV_NAME"
                if [[ "$TGT_NAME" = "$VDI_NAME" ]]
                then
                        # There is nothing to do
                        echo "Name already includes VM name, so nothing to do"
                else
                        # Here we reset the VDI name
                        echo xe vdi-param-set uuid=$VDI_UUID name-label="$TGT_NAME"
                        xe vdi-param-set uuid=$VDI_UUID name-label="$TGT_NAME"
                fi
        done
done

XenServer get VM by MAC

Wednesday, December 5th, 2012

Using the GUI, it can be somewhat complex to identify a VM based on its MAC address. There are several solutions on the net using PowerShell, but I will demonstrate it using a simple bash script, below. Save it, make it executable, and run it.

Enjoy

 

#!/bin/bash
if [ -z "$1" ]
then
	echo "Requires parameter - MAC address"
	exit 1
fi
 
MAC=$1
# You might want to check MAC correctness here. Enjoy doing it. RegExp, man!
 
# XenServer is agnostic to case for MAC addresses, so we don't care
VIF_UUID=`xe vif-list MAC=$MAC | grep ^uuid | awk '{print $NF}'`
 
VM=`xe vif-param-list uuid=$VIF_UUID | grep vm-name-label | awk '{print $NF}'`
 
echo "MAC $MAC has VM $VM"

Attach USB disks to XenServer VM Guest

Saturday, May 5th, 2012

There is a very nice script for Windows dealing with attaching XenServer USB disk to a guest. It can be found here.

This script has several problems, as I see it. The first is that it is a Windows batch script, a very limited language, and it can handle only a single VDI disk in the SR group called “Removable Storage”.

As I am a *nix guy, and can hardly handle Windows batch scripts, I have rewritten this script to run from Linux CLI (focused on running from the XenServer Domain0), and allowed it to handle multiple USB disks. My assumption is that running this script will map/unmap *all* local USB disks to the VM.

After downloading this script, you should make sure it is executable, and run it with the argument “attach” or “detach”, per your needs.

And here it is:

#!/bin/bash
# This script will map USB devices to a specific VM
# Written by Ez-Aton, http://run.tournament.org.il , with the concepts
# taken from http://jamesscanlonitkb.wordpress.com/2012/03/11/xenserver-mount-usb-from-host/
# and http://support.citrix.com/article/CTX118198
 
# Variables
# Need to change them to match your own!
REMOVABLE_SR_UUID=d03f247d-6fc6-a396-e62b-a4e702aabcf0
VM_UUID=b69e9788-8cd2-0074-5bc1-63cf7870fa0d
DEVICE_NAMES="hdc hde" # Local disk mapping for the VM
XE=/opt/xensource/bin/xe
 
function attach() {
        # Here we attach the disks
        # Check if storage is attached to VBD
        VBDS=`$XE vdi-list sr-uuid=${REMOVABLE_SR_UUID} params=vbd-uuids --minimal | tr , ' '`
        if [ `echo $VBDS | wc -w` -ne 0 ]
        then
                echo "Disks are allready attached. Check VBD $VBDS for details"
                exit 1
        fi
        # Get devices!
        VDIS=`$XE vdi-list sr-uuid=${REMOVABLE_SR_UUID} --minimal | tr , ' '`
        INDEX=0
        DEVICE_NAMES=( $DEVICE_NAMES )
        for i in $VDIS
        do
                VBD=`$XE vbd-create vm-uuid=${VM_UUID} device=${DEVICE_NAMES[$INDEX]} vdi-uuid=${i}`
                if [ $? -ne 0 ]
                then
                        echo "Failed to connect $i to ${DEVICE_NAMES[$INDEX]}"
                        exit 2
                fi
                $XE vbd-plug uuid=$VBD
                if [ $? -ne 0 ]
                then
                        echo "Failed to plug $VBD"
                        exit 3
                fi
                let INDEX++
        done
}
 
function detach() {
        # Here we detach the disks
        VBDS=`$XE vdi-list sr-uuid=${REMOVABLE_SR_UUID} params=vbd-uuids --minimal | tr , ' '`
        for i in $VBDS
        do
                $XE vbd-unplug uuid=${i}
                $XE vbd-destroy uuid=${i}
        done
        echo "Storage Detached from VM"
}
case "$1" in
        attach) attach
                ;;
        detach) detach
                ;;
        *)      echo "Usage: $0 [attach|detach]"
                exit 1
esac
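
The two UUIDs at the top of the script are site specific. A quick way to look up your own values (the name labels below are examples) is:

# Find the UUID of the removable-storage SR and of the target VM by their labels
xe sr-list name-label="Removable Storage" --minimal
xe vm-list name-label="my-guest" --minimal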

 

Cheers!

Bonding + VLAN tagging + Bridge – updated

Wednesday, April 25th, 2012

In the past I hacked around a problem with the startup order (and with several bugs) of a network stack combining network bonding (teaming) + VLAN tagging with network bridging (aka – Xen bridges). This kind of setup is very useful for introducing VLAN networks to guest VMs. This works well on Xen (community, Server), however, on RHEL/Centos 5 versions, the startup scripts (ifup and ifup-eth) are buggy, and do not handle this operation correctly. It means that, depending on the update release you use, results might vary from “everything works” to “I get bridges without VLANs” to “I get VLANs without bridges”.

I have hacked a solution in the past, modifying /etc/sysconfig/network-scripts/ifup-eth and fixing some bugs in it; however, maintaining the fix across every release of the ‘initscripts’ package has proven, well, not to happen…

So, instead, I present you with a smarter solution, better adapted to the updates supplied from time to time by RedHat or Centos, using predefined ‘hooks’ in the ifup scripts.

Create the file /sbin/ifup-pre-local with the following contents:

 

#!/bin/bash
# $1 is the config file
# $2 is not interesting
# We will start the vlan bonding before any bridge
 
DIR=/etc/sysconfig/network-scripts
 
[ -z "$1" ] &amp;&amp; exit 0
. $1
 
if [ "${DEVICE%%[0-9]*}" == "xenbr" ]
then
    for device in $(LANG=C egrep -l "^[[:space:]]*BRIDGE=\"?${DEVICE}\"?" /etc/sysconfig/network-scripts/ifcfg-*) ; do
        /sbin/ifup $device
    done
fi

You can download this script. Don’t forget to make it executable. It will call ifup for any parent device of the xenbr* device it was called for. If the parent device is already up, no harm is done. If the parent device is not up, it will be brought up, and then the xenbr device can start normally.
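
For clarity, here is a sketch of the kind of ifcfg files this hook expects to find: a VLAN interface on top of a bond, enslaved to a Xen bridge. The device names, the VLAN ID and leaving the bridge without an address are examples only:

# /etc/sysconfig/network-scripts/ifcfg-bond0.100 (example)
DEVICE=bond0.100
VLAN=yes
ONBOOT=yes
BRIDGE=xenbr100

# /etc/sysconfig/network-scripts/ifcfg-xenbr100 (example)
DEVICE=xenbr100
TYPE=Bridge
ONBOOT=yes
BOOTPROTO=none

With files like these in place, running ifup xenbr100 triggers the hook above, which brings bond0.100 (and with it the bond and its slaves) up before the bridge itself is configured.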

XenServer create snapshots for all machines

Friday, August 7th, 2009

XenServer is a wonderful tool. One of the better parts of it is its powerful scripting language, powered by the ‘xe’ command.

In order to capture a mass of snapshots, you can either do it manually from the GUI, or scripted. The script supplied below includes shell functions to capture quiesced snapshots and, if that fails, normal snapshots of every running VM on the system.

The reason: NetApp SnapMirror, or other scheduled backup actions (maybe for later export).

#!/bin/bash
# This script will supply functions for snapshotting and snapshot destroy including disks
# Written by Ez-Aton
# Visit my web blog for more stuff, at http://run.tournament.org.il
 
# Global variables:
UUID_LIST_FILE=/tmp/SNAP_UUIDS.txt
 
# Function
function assign_all_uuids () {
	# Construct artificial non-indexed list with name (removing annoying characters) and UUID
	LIST=""
	for UUID in `xe vm-list power-state=running is-control-domain=false | grep uuid | awk '{print $NF}'`
	do
		NAME=`xe vm-param-get param-name=name-label uuid=$UUID | tr ' ' _ | tr -d '(' | tr -d ')'`
		LIST="$LIST $NAME:$UUID"
	done
	echo $LIST
}
 
function take_snap_quiesce () {
	# We attempt to take a snapshot with quiesce
	# Arguments: $1 name ; $2 uuid
	# We attempt to snapshot the machine and set the value of snap_uuid to the snapshot uuid, if successful.
	# Return 1 if failed
 
	if SNAP_UUID=`xe vm-snapshot-with-quiesce vm=$2 new-name-label=${1}_snapshot`
	then
		# echo "Snapshot-with-quiesce for $1 successful"
		echo $SNAP_UUID
		return 0
	else
		echo "Snapshot-with-quiesce for $1 failed"
		return 1
	fi
}
 
function take_snap () {
	# We attempt to take a snapshot
	# Arguments: $1 name ; $2 uuid
	# We attempt to snapshot the machine and set the value of snap_uuid to the snapshot uuid, if successful.
	# Return 1 if failed
 
	if SNAP_UUID=`xe vm-snapshot vm=$2 new-name-label=${1}_snapshot`
	then
		#echo "Snapshot for $1 successful"
		echo $SNAP_UUID
		return 0
	else
		echo "Snapshot-with-quiesce for $1 failed"
		return 1
	fi
}
 
function stop_ha_template () {
	# Templates inherit their settings from the origin
	# We need to turn off HA
	# $1 : Template UUID
	if [ -z "$1" ]
	then
		echo "Missing template UUID"
		return 1
	fi
	xe template-param-set ha-always-run=false uuid=$1
}
 
function get_vdi () {
	# This function will get a space delimited list of VDI UUIDs of a given snapshot/template UUID
	# Arguments: $1 template UUID
	# It will also verify that each VBD is an actual snapshot
	if [ -z "$1" ]
	then
		echo "No arguments? We need the template UUID"
		return 1
	fi
	VDIS=""
	for VBD in `xe vbd-list vm-uuid=$1 | grep ^uuid | awk '{print $NF}'`
	do
		echo "VBD: $VBD"
		if [ ! `xe vbd-param-get param-name=type uuid=$VBD` = "CD" ]
		then
			CUR_VDI=`xe vdi-list vbd-uuids=$VBD | grep ^uuid | awk '{print $NF}'`
			if `xe vdi-param-get uuid=$CUR_VDI param-name=is-a-snapshot`
			then
				VDIS="$VDIS $CUR_VDI"
			else
				echo "VDI is not a snapshot!"
				return 1
			fi
			CUR_VDI=""
		fi
	done
	echo $VDIS
}
 
function remove_vdi () {
	# This function will get a list of VDIs and remove them
	# Careful!
	for VDI in $@
	do
		if xe vdi-destroy uuid=$VDI
		then
			echo "Success in removing VDI $VDI"
		else
			echo "Failure in removing VDI $VDI"
			return 1
		fi
	done
}
 
function remove_template () {
	# This function will remove a template
	# $1 template UUID
	if [ -z "$1" ]
	then
		echo "Required UUID"
		return 1
	fi
	xe template-param-set is-a-template=false uuid=$1
	if ! xe vm-uninstall force=true uuid=$1
	then
		echo "Failure to remove VM/Template"
		return 1
	fi
}
 
function remove_all_template () {
	# This function will completely remove a template
	# The steps are as follow:
	# $1 is the UUID of the template
	# Calculate its VDIs
	# Remove the template
	# Remove the VDIs
	if [ -z "$1" ]
	then
		echo "No Template UUID was supplied"
		return 1
	fi
	# We now collect the value of $VDIS
	get_vdi $1
	if [ "$?" -ne "0" ]
	then
		echo "Failed to get VDIs for Template $1"
		return 1
	fi
	if ! remove_template $1
	then
		echo "Failure to remove template $1"
		return 1
	fi
	if ! remove_vdi $VDIS
	then
		return 1
	fi
}
 
function create_all_snapshots () {
	# In this function we will run all over $LIST and create snapshots of each machine, keeping the UUID of it inside a file
	# $@ - list of machines in the $LIST format
	if [ -f $UUID_LIST_FILE ]
	then
		mv $UUID_LIST_FILE $UUID_LIST_FILE.$$
	fi
	for i in $@
	do
		SNAP_UUID=`take_snap_quiesce ${i%%:*} ${i##*:}`
		if [ "$?" -ne "0" ]
		then
			echo "Problem taking snapshot with quiesce for ${i%%:*}"
			echo "Attempting normal snapshot"
			SNAP_UUID=`take_snap ${i%%:*} ${i##*:}`
			if [ "$?" -ne "0" ]
                	then
                        	echo "Problem taking snapshot for ${i%%:*}"
				SNAP_UUID=""
			fi
		fi
		stop_ha_template $SNAP_UUID
		echo $SNAP_UUID >> $UUID_LIST_FILE
	done
}

Possible use will be like this:

. /usr/local/bin/xen_functions.sh

create_all_snapshots `assign_all_uuids` &> /tmp/snap_create.log
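
Later on, when the backup (SnapMirror, export, or whatever you run) is done, the snapshot UUIDs recorded in the file pointed to by UUID_LIST_FILE can be fed back into the cleanup functions. Roughly like this (the functions file path is the same example as above):

. /usr/local/bin/xen_functions.sh

# Remove every snapshot recorded in the last run, including its disks
for SNAP in `cat /tmp/SNAP_UUIDS.txt`
do
	remove_all_template $SNAP >> /tmp/snap_remove.log 2>&1
done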

Protect Vmware guest under RedHat Cluster

Monday, November 17th, 2008

Most documentation on the net is about how to run a cluster-in-a-box under Vmware. Very few seem to care about protecting Vmware guests under real RedHat cluster with a shared storage.

This article is just about it. While I would not recommend using Vmware in such a setup, it has been the case, and that Vmware guest actually resides on the shared storage. To relocate it is out of the question, so migrating it together with other resources is the only valid option.

To do so, I have created a simple script which will accept start/stop/status arguments. The Vmware guest VMX is hard-coded into the script, but in an easy-to-change format. This script will attempt to freeze the Vmware guest, and only if it fails, to shut it down. Mind you that the blog’s HTML formatting might alter quotation marks into UTF-8 marks which will not be understood well by shell.

#!/bin/bash
# This script will start/stop/status VMware machine
# Written by Ez-Aton
# http://www.tournament.org.il/run
 
# Hardcoded. Change to match your own settings!
VMWARE="/export/vmware/hosts/Windows_XP_Professional/Windows XP Professional.vmx"
VMRUN="/usr/bin/vmrun"
TIMEOUT=60
 
function status () {
  # This function will return success if the VM is up
  $VMRUN list | grep "$VMWARE" &>/dev/null
  if [[ "$?" -eq "0" ]]
  then
    echo "VM is up"
    return 0
  else
    echo "VM is down"
    return 1
  fi
}
 
function start () {
  # This function will start the VM
  $VMRUN start "$VMWARE"
  if [[ "$?" -eq "0" ]]
  then
    echo "VM is starting"
    return 0
  else
    echo "VM failed"
    return 1
  fi
}
 
function stop () {
  # This function will stop the VM
  $VMRUN suspend "$VMWARE"
  for i in `seq 1 $TIMEOUT`
  do
    if status
    then
      echo
    else
      echo "VM Stopped"
      return 0
    fi
    sleep 1
  done
  $VMRUN stop "$VMWARE" soft
}
 
case "$1" in
start)     start
        ;;
stop)      stop
        ;;
status)   status
        ;;
esac
RET=$?
 
exit $RET

Since the formatting is killed by the blog, you can find the script here: vmware1

I intend on building a “real” RedHat Cluster agent script, but this should do for the time being.

Enjoy!

Splitting archive and combining later on the fly

Wednesday, July 18th, 2007

Many of us use tar (often with gzip or bzip2) for archiving purposes. When performing such an action, a large file remains, usually too large. Extracting from it, or splitting it, becomes an effort.

This post will show an example of a small script to split an archive and later on, to directly extract the data out of the slices.

Let’s assume we have a directory called ./Data . To archive it using tar+gzip, we can perform the following action:

tar czf /tmp/Data.tar.gz Data

For verbose display (although it could slow things down a bit), add the flag ‘v’.

Now we have a file called /tmp/Data.tar.gz

Lets split it to slices sized 10 MB each:

cd /tmp
mkdir slices
i=1 # Our counter
skip=0 # This is the offset. Will be used later
chunk=10 # Slice size in MB
let size=$chunk*1024 # And in kbytes
file=Data.tar.gz # Name of the tar.gz file we slice
while true ; do
# Deal with numbers lower than 10
if [ $i -lt "10" ]; then
j=0${i}
else
j=${i}
fi
dd if=${file} of=slices/${file}.slice${j} bs=1M count=${chunk} skip=${skip}
# Just to view the files with out own eyes
ls -s slices/${file}.slice${j}
if [ `ls -s slices/${file}.slice${j} | awk '{print $1}'` -lt "${size}" ]; then
echo "Done"
break
fi
let i=$i+1
let skip=$skip+$chunk
done

This will break the tar.gz file into files with running numbers added to their names. It assumes that the number of slices would not exceed 99. You can extend the script to deal with three-digit numbers. The sequence is important for later. Stay tuned :-)
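
As a side note, on reasonably modern systems the same slicing can be done with the ‘split’ tool, which handles the numbering by itself. This is shown only as an alternative to the loop above:

cd /tmp
mkdir -p slices
# 10MB pieces with 3-digit numeric suffixes: Data.tar.gz.slice000, 001, ...
split -b 10M -d -a 3 Data.tar.gz slices/Data.tar.gz.slice
# Later, combine them (or pipe straight into tar) with:
cat slices/Data.tar.gz.slice* | tar xzvf -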

Ok, so we have a list of files with a numerical suffix, which, combined, include our data. We can test their combined integrity:

cd /tmp/slices
file=Data.tar.gz
for i in `ls ${file}.slice*`; do
cat ${i} >> ../Data1.tar.gz
done

This will allow us to compare /tmp/Data.tar.gz and /tmp/Data1.tar.gz. I tend to use md5sum for such tasks:

md5sum Data.tar.gz
d74ba284a454301d85149ec353e45bb7 Data.tar.gz
md5sum Data1.tar.gz
d74ba284a454301d85149ec353e45bb7 Data1.tar.gz

They are identical. Great. We can remove Data1.tar.gz. We don’t need it anymore.

To recover the contents of the slices, without actually wasting space by combining them before extracting their contents (which requires time, and disk space), we can run a script such as this:

cd /tmp/slices
file=Data.tar.gz
(for i in `ls ${file}.slice*`; do
cat $i
done ) | tar xzvf -

This will extract the contents of the joined archive to the current directory.

This is all for today. Happy moving of data :-)

RedHat / Centos Kickstart tweaks

Sunday, July 1st, 2007

Kickstart is a great method of hands-free installation of RHEL/Centos (and other derived systems). Its power is in its easy interface and rather powerful %post scripting directives. Its weakness is in its lack of flexibility where it comes to package selection and various custom actions.

In some cases, companies use a web interface (usually home-made) which builds kickstart config files on demand. In other cases, the administrator is required to build several kickstart config files for pre-anticipated setups.

I was looking for something which will give me the power to maintain a fixed configuration on one hand, and will allow me some tweaks and variants, when I want them. I could have used the %post scripting sections, but this gets quite complicated, especially when you want to add only one package (but with its dependencies), or you want to force full update of the system before it goes online, or even select its hostname, assuming it is not yet defined in the DNS.

I base my system on a simple DHCP/BootP + tftp server which answers all bootp requests and offers a simple menu (just type a number and press Enter). The original schema was quite simple: type 4 for Centos4.3, and then add -min if you wanted it to use a kickstart file with a minimum configuration. Then I wanted to add the option to update the system in an early stage, so I added -update, which would have looked in the menu like a “4-min-update” option. Quite readable, however, it generated lots of work when maintaining the pxelinux.cfg/default file and the ks files themselves. Too many variations tend to require lots of care.

Adding parameters to the boot menu is possible, and would result in them existing in /proc/cmdline for later parsing.

I have decided to parse a set of predefined parameters supplied during boot time, and to change the kickstart config file according to them. It actually works quite well. This is a less-sophisticated and more of a stand-alone system compared to this system. Also, it doesn’t require me to alter the system’s boot process.

This is my ks.cfg file, which includes the flexibility additions:

# Kickstart file generated by Ez-Aton

install
nfs --server=install-server --dir=/mnt/samba/Centos
lang en_US.UTF-8
langsupport --default=en_US.UTF-8 en_US.UTF-8
keyboard us
skipx
network --device eth0 --bootproto dhcp
rootpw --iscrypted RpUKzjDc9k2gU
firewall --disabled
selinux --disabled
authconfig --enableshadow --enablemd5
timezone Asia/Jerusalem
bootloader --location=mbr

%packages
e2fsprogs
grub
lvm2
kernel
net-snmp
net-snmp-utils
kernel-devel
kernel-smp-devel
gcc

%pre
# By Ez-Aton http://www.tournament.org.il/run
for i in `cat /proc/cmdline`; do
echo $i >> /tmp/vars.tmp
done
grep "=" /tmp/vars.tmp > /tmp/vars
KS=/tmp/ks.cfg
update=""
name=""
pkg=""
. /tmp/vars
if [ ! -z "$update" ]; then
echo "yum update -y" >> $KS
fi
if [ ! -z "$name" ]; then
value="dhcp --hostname $name"
cat $KS | sed s/dhcp/"$value"/ > $KS.tmp
cat $KS.tmp > $KS
fi
if [ ! -z "$pkg" ]; then
pkg_line=`grep -n ^%packages $KS | cut -f 1 -d \:`
max_line=`wc -l $KS | awk '{print $1}'`
head -n $pkg_line $KS > $KS.tmp
for i in `echo $pkg | sed s/,/\ /g`; do
echo $i >> $KS.tmp
done
let tail_line=$max_line-$pkg_line
tail -n $tail_line $KS >> $KS.tmp
cat $KS.tmp > $KS
fi

%post

So, as you can see, I take the following parameters:

update=yes (it can be update=anything)

name=hostname (in case it cannot be retrieved from the DHCP server)

pkg=pkg1,pkg2,{pkg3,…} (To add specific packages to the installation)
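To give an idea of how these parameters reach /proc/cmdline, a hypothetical pxelinux.cfg/default entry might append them like this. The label, kernel/initrd paths and the values are examples only:

label 4-min
  kernel centos4/vmlinuz
  append initrd=centos4/initrd.img ks=nfs:install-server:/mnt/samba/Centos/ks.cfg name=web01 update=yes pkg=net-snmp,screen
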

It was tested to work on a Centos4.3 system, and will probably work on RHEL and Centos versions 4.x all along. I didn’t test it on RHEL5/Centos5 yet.

If you use the script, please leave my name and blog URL in it. Also, if you modify it for your needs, I would be glad to get back the modifications you have made, to include them.

Enjoy.