Archive for December, 2009

NetApp SnapMirror monitor script

Sunday, December 13th, 2009

I have been working with NetApp SnapMirror lately. I have mirrored some volumes and qtrees, and I wanted to monitor their usage and behavior over the line.

As you can expect, site-to-site replication of data is a fragile thing, especially when done at the level of the storage device, which is agnostic to the data kept on it. When replicating volumes, I have to rely on the relevant employees to be responsible about what is placed there, because the storage does not filter out the junk. If someone decides to put a new DVD image on the DB storage space, well, the DB won’t care as long as there is enough free space, but the storage will attempt to replicate the added data to the alternate site. If you are already near your bandwidth limit, which is never a good thing, this creates a replication lag you would hardly (if at all) be able to close.

For that, and since I don’t tend to trust people not to do stupid things, I have written this script.

What does it do?

This script will perform the following:

Alerting about non-idle SnapMirror session

Use with ‘-m alert’

Assuming SnapMirror is scheduled to run at a specific time, the script will alert if a session is active when it should not be. By default it sends an e-mail if an address is configured (see the configuration section below); with the flag ‘-a no’, it will not. With ‘-r yes’, it will react by setting a throttle for each non-idle session, in which case ‘-t VALUE’ must also be specified, where VALUE is the numeric throttle in KB/s.

Limiting throttle to a SnapMirror session

Use with ‘-m throttle_limit’

The script will set a throttle for SnapMirror session(s). Set the limit with the flag ‘-t VALUE’, where VALUE is the numeric throttle in KB/s for each session.

Cancelling throttle limit

Use with ‘-m throttle_unlimit’

The script will set unlimited throttle for SnapMirror session(s).

Checking SnapMirror lag

Use with ‘-m check_lag’

Since the purpose of replication is recovery, the lag of each SnapMirror session shows how far behind the replica is. Use with ‘-d VALUE’, where VALUE is the alert threshold in minutes. The default threshold is one day (1440 minutes).
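The conversion from a reported lag to minutes can be sketched on its own, assuming the lag arrives as hh:mm:ss. The function name lag_to_minutes is hypothetical and is not part of the script below; it is only a minimal illustration of the arithmetic.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script below): convert a lag
# string in hh:mm:ss form into whole minutes, ignoring the seconds.
lag_to_minutes () {
        # awk reads the colon-separated fields as decimal numbers,
        # so a field such as "08" is taken as 8
        echo "$1" | awk -F: '{ print $1 * 60 + $2 }'
}

# Example: 26 hours and 8 minutes of lag
lag_to_minutes 26:08:15
```

A result above the ‘-d’ threshold (1440 by default) is what would trigger the alert.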

Checking snapshots size

Use with ‘-m check_size’

This reports the expected delta to transfer. This can help estimate the success or failure of a future sync of data (snapmirror update) before it begins. Use the ‘-l’ flag to log the date/time of the measurement and the expected size to a file. By default the file is /tmp/target_name.txt, where target_name is the SnapMirror target with its slashes replaced by underscores.
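Once a few measurements have been logged, the file can be summarized. The sketch below assumes each log line ends with a numeric delta in KB, preceded by the output of ‘date’; the function name largest_delta is hypothetical and is not part of the script.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script below): print the largest
# delta recorded in a log file written with the '-l' flag.
# Assumption: the delta is the last whitespace-separated field of each line.
largest_delta () {
        # $NF is the last field; "+ 0" forces a numeric comparison
        awk '$NF + 0 > max { max = $NF + 0 } END { print max + 0 }' "$1"
}
```

It would be called with the log file as its only argument, for example largest_delta /tmp/storage3:volname2.txt for the target storage3:volname2.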

General Options

Use with ‘-c filename’ for alternate configuration file.

Use with ‘-h’ to get general help.

Use with a list of target names in the format storage:/vol/volname/qtree or storage:volname to ignore the targets in the configuration file and use your own.

Configuration File

The configuration file is rather simple. By default it should be called “/etc/snapmirror_monitor.conf”. It consists of two main variables for the system:

TGTS="storage2:/vol/volname/qtree
storage3:volname2
storage1:/vol/volnew/qtr2"
EMAIL="user@domain.com another_user@domain.com"
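Each target is a filer name and a path joined by a colon, and the script relies on shell parameter expansion to take them apart. A minimal sketch of that split follows; split_target is a hypothetical name used only for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script below): split a target of
# the form filer:path into its two parts.
split_target () {
        # ${1%%:*} keeps the text before the first colon (the filer name)
        # ${1#*:}  keeps the text after the first colon (volume or qtree path)
        echo "${1%%:*} ${1#*:}"
}

split_target storage2:/vol/volname/qtree
split_target storage3:volname2
```

The first call prints the filer storage2 followed by /vol/volname/qtree, the second prints storage3 followed by volname2.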

Prerequisites

This script will run on any modern Linux machine. For it to communicate with the NetApp devices, you will need SSH enabled on the NetApps, and an SSH key exchange so that the Linux machine can access the NetApps without using passwords.

The Script

Below is the script. You can download it and use it as you like.

#!/bin/bash
# This script will monitor snapmirror status
# Assumption: Access through ssh to root on all storage devices involved
# This will also attempt to detect the diff which is to sync

# Written by Ez-Aton. Check http://run.tournament.org.il for updates or
# additional information

# Modes: 
# alert -> alert if snapmirror is still active
# throttle_limit -> Limit throttle to a given number (default or manually set)
# throttle_unlimit -> Open throttle limitation
# check_lag -> Report the snapmirror lag
# check_size -> Report the estimated data size to move

# Global variables
CONF=/etc/snapmirror_monitor.conf
LOG_PREFIX=/tmp

test_connection () {
        # Test to see that you can access the storage device
        # Arguments: NetApp name
        SSH_OPTS="-o ConnectTimeout=2"
        if ! ssh $SSH_OPTS $1 hostname &>/dev/null
        then
                echo "Cannot communicate via SSH to $1"
                exit 1
        fi
}

abort () {
        # Exit with a predefined error message
        echo $*
        exit 1
}

get_arguments () {
        # Get all arguments and define options
        # Argument: $@
        [ -z "$1" ] && set -- -h
        while [ -n "$1" ]
        do
                case "$1" in
                        -m)     shift
                                case "$1" in
                                        alert|throttle_limit|throttle_unlimit|check_lag|check_size)     MODE=$1
                                        ;;
                                        *)      abort "Mode is mandatory. Use -h flag to get list of available flags"
                                        ;;
                                esac
                                ;;
                        -a)     shift
                                case "$1" in
                                        [nN][oO])       NOMAIL=1
                                                        ;;
                                        *)              NOMAIL=0
                                                        ;;
                                esac
                                ;;
                        -r)     shift
                                case "$1" in
                                        [yY][eE][sS])   REACT=1
                                                        ;;
                                        *)              REACT=0
                                                        ;;
                                esac
                                ;;
                        -d)     shift
                                declare -i DELAY_TMP
                                DELAY_TMP=$1
                                [ "$DELAY_TMP" != "$1" ] && abort "Delay needs to be a number in minutes"
                                DELAY=$DELAY_TMP
                                ;;
                        -t)     shift
                                declare -i THROTTLE_TMP
                                THROTTLE_TMP=$1
                                [ "$THROTTLE_TMP" != "$1" ] && abort "Throttle needs to be a number"
                                THROTTLE=$THROTTLE_TMP
                                ;;
                        -c)     shift
                                [ -f "$1" ] || abort "Cannot find specified conf file"
                                CONF="$1"
                                ;;
                        -l)     LOG=1
                                ;;
                        -h)     echo "Usage: $0 -m [alert|throttle_limit|throttle_unlimit|check_lag|check_size] (-c CONF_FILE) [tgt_filer:volume tgt_filer:/vol/vol/qtree]"
                                echo "Alert if SnapMirror is still running: $0 -m alert [-a no] (-r yes) [tgt_filer:volume tgt_filer:/vol/vol/qtree]"
                                echo "Alert and throttle (react): $0 -m alert [-a no] -r yes -t [throttle_in_kb] [tgt_filer:volume tgt_filer:/vol/vol/qtree]"
                                echo "Throttle a running SnapMirror: $0 -m throttle_limit -t throttle_in_kb [tgt_filer:volume tgt_filer:/vol/vol/qtree]"
                                echo "Unlimit SnapMirror throttle: $0 -m throttle_unlimit [tgt_filer:volume tgt_filer:/vol/vol/qtree]"
                                echo "To check lag: $0 -m check_lag -d delay_in_minutes (-a no) [tgt_filer:volume tgt_filer:/vol/vol/qtree]"
                                echo "To check delta: $0 -m check_size [tgt_filer:volume tgt_filer:/vol/vol/qtree]"
                                exit 0
                                ;;
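                        *)      [ -z "$MODE" ] && abort "$0 mode required"
                                # All remaining arguments are targets. Stop parsing here,
                                # otherwise each further shift would drop one target
                                TGTS="$*"
                                break
                                ;;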
                esac
                shift
        done
}

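notify () {
        # Send an e-mail notification
        # Arguments: $@ - the subject
        # Contents are empty
        # And yes - one e-mail per event
        mail -s "$*" $EMAIL < /dev/null
}

idle () {
        # Checks if the snapmirror is idle. If so, return true
        # Arguments: Target name (example: storage:/vol/volname/qtree)
        # Assumption: the last line of 'snapmirror status' reads "Idle"
        # when no transfer is running for this target

        # Get the storage name out
        NETAPP=${1%%:*}
        test_connection $NETAPP #Verify this netapp is accessible
        ssh $NETAPP snapmirror status $1 | tail -1 | grep -i idle > /dev/null
        return $?
}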

set_throttle () {
        # Sets throttle for target
        # Arguments: $1 Target name (example: storage:/vol/volname/qtree)
        # Arguments: $2 throttle value (number)

        # Get the storage name out
        NETAPP=${1%%:*}
        test_connection $NETAPP #Verify this netapp is accessible
        ssh $NETAPP snapmirror throttle $2 $1
}

get_lag () {
        # Gets the lag of snapmirror relationship in minutes
        # Arguments: Target name (example: storage:/vol/volname/qtree)

        # Get the storage name out
        NETAPP=${1%%:*}
        test_connection $NETAPP #Verify this netapp is accessible
        LAG=`ssh $NETAPP snapmirror status $1 | tail -1 | awk '{print $4}'`
        # LAG is in hh:mm:ss. We need to transfer it to minutes only
        H=`echo $LAG | cut -f 1 -d :`
        M=`echo $LAG | cut -f 2 -d :`
        # Force base 10 so that values such as 08 or 09 are not read as octal
        let M=10#$M+10#$H*60
        echo $M
}

check_size () {
        # Checks the size of the snapshot to copy (diff)
        # Arguments: Target name (example: storage:/vol/volname/qtree)

        # Get the storage name out
        NETAPP=${1%%:*}
        test_connection $NETAPP #Verify this netapp is accessible
        # Get source storage name and path
        SRC=`ssh $NETAPP snapmirror status $1 | tail -1 | awk '{print $1}'`
        # Get the source filer and vol name from that
        NETAPP=${SRC%%:*}
        SPATH=${SRC##*:}
        # Strip the leading /vol/ and anything after the volume name
        SPATH=`echo $SPATH | sed 's|^/vol/||'`
        SPATH=${SPATH%%/*}

        test_connection $NETAPP # Verify the source NetApp is accessible
        SNAP=`ssh $NETAPP snap list -n $SPATH | grep snapmirror | tail -1 | awk '{print $4}'`
        DELTA=`ssh $NETAPP snap delta $SPATH $SNAP | tail -2 | head -1 | awk '{print $5}'`
        echo "Snap delta for $1 is $DELTA KB"  
        LOG_TARGET=`echo $1 | tr / _`.txt
        [ -n "$LOG" ] && echo "`date` $DELTA" >> $LOG_PREFIX/$LOG_TARGET
}


### MAIN ###
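get_arguments "$@"
# Remember targets given on the command line; they override the configuration file
CLI_TGTS="$TGTS"
. $CONF &>/dev/null
[ -n "$CLI_TGTS" ] && TGTS="$CLI_TGTS"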
# if e-mail is not set, don't try to send
[ -z "$EMAIL" ] && NOMAIL=1

[ -z "$TGTS" ] && abort "You need at least one snapmirror target"

case $MODE in
        alert)  if [ "$REACT" == "1" ]
                then
                        [ -z "$THROTTLE" ] && abort "When setting 'react' flag, you must specify throttle"
                fi
                for i in $TGTS
                do
                        if ! idle $i
                        then
                                echo -n "$i is not idle. "
                                [ "$NOMAIL" != "1" ] && notify "$i is not idle"
                                if [ "$REACT" == "1" ]
                                then
                                        echo -n "We are set to react. Limiting throttle"
                                        set_throttle $i $THROTTLE
                                fi
                                echo
                        fi
                done
                ;;
        throttle_limit) [ -z "$THROTTLE" ] && abort "Throttle requires throttle value"
                        for i in $TGTS
                        do
                                echo "Setting throttle for $i to $THROTTLE"
                                set_throttle $i $THROTTLE
                        done
                        ;;
        throttle_unlimit)       for i in $TGTS
                                do
                                        echo "Setting throttle for $i to unlimited"
                                        set_throttle $i 0
                                done
                        ;;
        check_lag)      [ -z "$DELAY" ] && DELAY=1440
                        for i in $TGTS
                        do
                                LAG=`get_lag $i`
                                if [ "$LAG" -gt "$DELAY" ]
                                then
                                        echo "Failure: The delay for $i is $LAG minutes"
                                        [ "$NOMAIL" != "1" ] && notify "$i is lagged $LAG minutes, above the threshold $DELAY"
                                else
                                        echo "Normal: The delay for $i is $LAG minutes"
                                fi
                        done
                        ;;
        check_size)     for i in $TGTS
                        do
                                check_size $i
                        done
                        ;;
        *)      echo "Option $MODE is not implemented yet"
                exit 0
                ;;
esac

Rapid-guide – Updating RedHat initrd

Saturday, December 5th, 2009

Warning: This is not the recommended method if you’re not sure you know what you’re doing.

Linux Initial Ram Disk (initrd) is a mechanism to perform disk-independent actions before attempting to mount the ‘/’ disk. These actions usually include loading disk drivers, setting up LVM or software RAID, etc.

The reason these actions are performed within initrd is that it is all based on Ram Disk loaded by the boot loader, and thus it breaks the loop of “how would I load storage drivers without storage access?”

Sometimes, due to some special event, we need to modify it manually. To do so, we first need to unpack it, and then pack it back, replacing the previous one (back up the old one, will you?).

This is rather simple. The tools used by us will be ‘gzip’ and ‘cpio’.

Let’s begin.

First – create a temporary directory:

mkdir /tmp/initrd

Extracting

We have our temporary directory, so now we need to extract the initrd into it. I assume the name of the file is /boot/initrd.img. You should replace it in my line with whatever the name of your initrd file is:

cd /tmp/initrd

cat /boot/initrd.img | gzip -dc |  cpio -id

This will extract the contents of the initrd into /tmp/initrd.

Now you can edit its contents directly.

Package

To package initrd back in, we will need to perform the following actions.

Warning: before you do it, make sure you have a copy of your original initrd file available, in case you have caused some damage.
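One way to take that copy is sketched below. The function name backup_initrd and the timestamped .bak suffix are assumptions made for illustration, not an established convention.

```shell
#!/bin/bash
# Hypothetical helper: copy an initrd image aside before overwriting it.
# Prints the name of the backup copy on success.
backup_initrd () {
        local src=$1
        # Timestamped suffix, so repeated backups do not overwrite each other
        local dst="$src.bak.$(date +%Y%m%d%H%M%S)"
        # -p preserves mode and timestamps of the original image
        cp -p "$src" "$dst" && echo "$dst"
}
```

Calling backup_initrd /boot/initrd.img before the packaging step leaves the original image next to the new one, ready to be restored from a rescue environment if the machine fails to boot.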

cd /tmp/initrd

find . | cpio -o -H newc | gzip -9 > /boot/initrd.img

This line packages the initrd, and replaces the old one.

That’s all for today 🙂

Quickly install Xen Community Linux VM

Saturday, December 5th, 2009

On RHEL-type systems with virt-manager (libvirt), you can use virt-manager to ease your life. I myself prefer to work with the ‘xm’ tools, but for the initial install, virt-manager is the quickest and simplest tool available.

To install a new Linux VM, all you need is to follow this flow:

Create an LV for your VM (I use LVs because it’s easier to manage). If not LV, use a file. To create an LV, run the following command

lvcreate -L 10G -n new_vm1 VolGroup00

I assume that the name you wish to grant is ‘new_vm1’ (better maintain order there, else you will find yourself with hundreds of small LVs you have no idea what to do with), and that the name of the volume group is ‘VolGroup00’. Change to different values to match your environment.

Next, make sure you have your ISO contents unpacked (you can use loop device) and exported via NFS (my favorite method).

To mount a CD/DVD ISO, you should use ‘mount’ command with the ‘loop’ options. This would look like this:

mount -o loop my_iso.iso /mnt/temp

Again, I assume the name of the ISO is my_iso.iso and that the target directory /mnt/temp is available.

Now, export your newly created directory. If you have NFS already running, you can either add to /etc/exports the newly mounted directory /mnt/temp and restart the ‘nfs’ service, or you could use ‘exportfs’ to add it:

exportfs -o no_root_squash *:/mnt/temp

would probably do the trick. I added ‘no_root_squash’ to make sure no permission/access problems present themselves during the installation phase. Test your export to verify it’s working.

Now you could begin your installation. Run the following command:

virt-install -n new_vm1 -r 512 -p -f /dev/VolGroup00/new_vm1 --nographics nfs://nfs_server:/mnt/temp

The name follows the ‘-n’ flag. The amount of RAM to give is 512MB. The -p means it’s paravirtualized. The -f shows which device will be the block device, and the last argument is the source of the installation. Do not use local files, as the VM installer should be able to access the installation source.

Following that, you should have a very nice TUI installation experience.

Now – let’s make this machine ‘xm’ compatible.

Currently, the VM is virt-manager compatible. It means you need virt-manager to start/stop it correctly. Since I prefer ‘xm’ commands, I will show you how to convert this machine to an ‘xm’ configuration.

First – export its XML file:

virsh dumpxml new_vm1 > /tmp/new_vm1.xml

virsh domxml-to-native xen-xm /tmp/new_vm1.xml > /etc/xen/new_vm1

This should do the trick.

Now you can turn the newly created VM off, and remove the VM from virt-manager using

virsh undefine new_vm1

and you’re back to ‘xm’-only interface.