Posts Tagged ‘diskless systems’

Stateless Systems (diskless, boot from net) on RHEL5/Centos5

Thursday, July 21st, 2011

I have encountered several methods of doing stateless RedHat Linux systems. Some of them are over-sophisticated, and it doesn’t work. Some of them are too old, and you either have to fix half the scripts, or give up (which I did, BTW), and after long period of attempts, I have found my simple-yet-working-well-enough solution. It goes like that (drums please!)

yum install system-config-netboot

Doesn’t it sound funny? So simple, and yet – so working. So I have decided to supply you with the ultimate, simple, working guide to this method – how to make it work – fast and easy.

You will need the following:

  • RHEL5/Centos5 with ability to run yum client, and enough disk space to contain an entire system image, and more. Lots more. About it later. This machine will be called “server” in this guide.
  • A single “Golden Image” system – the base of your system-to-replicate. A system you will, with some (minor) modifications, use as the source of all your future stateless systems. Usually – resides on another machine (physical or virtual, doesn’t matter much. More about it later). This machine will be called “GI” in this guide.
  • A test system. Better be diskless, for assurance. Better also be able to boot from net, otherwise, we miss something here (although hybrid boot methods are possible, I will not discuss them here, for the time being). Can be virtual, as long as it is full hardware virtualization, as you cannot net-boot, except for the latest Xen Community versions, in PV mode. It will be called “net-client” or “client” in this guide.

Our flow:

  • Install required server-side tools
  • Setup server configuration parameters
  • Image the GI into the server
  • Run this and that
  • Boot our net-client happily

Let’s start!

On the server, run:

yum install -y system-config-netboot xorg-x11-xinit dhcp

and then, run:

chkconfig dhcpd on
chkconfig tftp on
chkconfig xinetd on
chkconfig nfs on

We will now perform configurations for the above mentioned services.

Your /etc/dhcpd.conf should look like this:

ddns-update-style interim;
ignore client-updates;

subnet $NETWORK netmask $NETMASK {
option routers              $GATEWAY;
option subnet-mask          $NETMASK;
option domain-name-servers  $DNS;
ignore unknown-clients;
filename “linux-install/pxelinux.0″;
next-server $SERVERIP;
}

You should replace all these variables with the ones matching your own layout. In my case

NETWORK=10.0.0.0
NETMASK=255.0.0.0
GATEWAY=10.200.1.1
DNS=10.100.1.4
SERVERIP=10.100.1.8

We will include hosts later. Notice that this DHCP server will refuse to serve unknown hosts. This will prevent the “Oops” factor…

We can start our DHCPD now. It won’t do anything, and it’s a good opportunity to test our configuration:

service dhcpd start

We need to create the directory structure. I have selected /export/NFSroots as my base structure. I will follow this structure here:

mkdir -p /export/NFSroots

We will proceed with the NFS server settings later.

Imaging the GI to the server is quite simple. We will begin by creating a directory in /export/NFSroots with the name of the imaged system type. In my case:

mkdir /export/NFSroots/centos5.6

However (and this is the tricky part!), we will create a directory under this location, called ‘root’. We will image the entire contents of our GI to this root directory. This is how this mechanism works, and I have no desire of bending it. So

mkdir /export/NFSroots/centos5.6/root

Now we copy the contents of the GI to this directory. There are several methods, but in this particular case, I have chosen to use ‘rsync’ over ‘ssh’. There are other methods just as good. Just one note – on the GI, before this copy, we will need to have busybox-anaconda package. So make sure you have it:

yum install -y busybox-anaconda

Now we can create an image from the GI:

rsync -e ssh -av –exclude ‘/proc/*’ –exclude ‘/sys/*’ –exclude ‘/dev/shm/*’ $GI_IP:/ /export/NFSroots/centos5.6/root

You must include as many exclude entries as required. You should never cause the system to attempt to grab the company’s huge NFS share, just because you have forgotten some ‘exclude’ entry. This will cause major loss of time, and possibly – some outage to some network resource. Be careful!

This is the best time to grub a cup of coffee/tee/cocoa/coke/whatever, and chit-chat with your friends. I have had about 10 minutes for 1.8GB image, via 1Gb/s network link, so you can do some math and guesswork, and probably – go grab launch.

When this operation is complete, your next mission is to configure your NFS server. Order does matter. You should create a read-only entry for the image root directory, and a full-access entry for the above structure. Well – I can probably narrow it down, but I did not really bother. During pre-production, I will test how to narrow permissions down, without getting myself into management hell. So – our entries are looking like this in /etc/exports :

/export/NFSroots/centos5.6/root *(ro,no_root_squash)
/export/NFSroots *(rw,no_root_squash,async)

I would love to hear comments by people who did narrow security down to the required minimum, and how they managed to handle tenths and hundreds of clients, or several dozens of RO images without much hassle. Really. Comment, people, if you have something to add. I would love to modify this document accordingly.

Now we need to start our NFS server:

service nfs start

We are ready for the big entry. You will need GUI here. We have installed xorg-x11-xauth which will allow us, if invoked, remote X. I use X over SSH, so it’s a breeze. Run the command:

system-config-netboot

I will not bother with screenshots, as this is not my style, but the following entries will be required:

  • First Time Druid: Diskless
  • Name: Easy identified. In my case: Centos5.6
  • Server IP address: $SERVERIP
  • Directory: /export/NFSroots/centos5.6
  • Kernel: Select the desired kernel
  • Apply

The system might take several minutes to complete. When done, you will be presented with a blank table. We need to populate it now.

For that to be easy, we better have all our clients in /etc/hosts file. While DNS does help, a lot, this specific tool, for it to work like charm, requires /etc/hosts to have the entry, or else, it just won’t be very readable. Below is a one-liner script to create a set of entries in /etc/hosts. I have started with .100 and completed with .120. This can be changed easily to match your requirements:

for i in {100..120} ; do echo “10.100.1.$i   pxe${i}” >> /etc/hosts ; done

This way we can refer to our clients as pxe100, pxe101, and so on.

Let’s create a new client!

Select “new” from the GUI menu, and fill in the name ‘pxe100′. If you have multiple images, this is a good time to select which one you will be using. If you have followed this guide to the letter, you have only a single option. Press on “OK” and you’re done.

We need to setup MAC and IP addresses into the DHCP server. I have written a script which assists with this greatly. For this script to work, you will need to run the following commands:

echo “” > /etc/dhcpd.conf.bootnet
echo ‘include “/etc/dhcpd.d/dhcpd.conf.bootnet”;’ >> /etc/dhcpd.conf

#!/bin/bash
# Creates and removes netboot nodes
 
# Check arguments
 
# VARIABLES
DHCPD_CONF=/etc/dhcpd.conf.bootnet
SERVICE=dhcpd
DATE=`date +"%H%M%S%d%m%y"`
BACKUPS=${DHCPD_CONF}.backups
 
function show_help() {
	echo "Usage: $0"
	echo "add NAME MAC IP"
	echo "where NAME is the name of the node"
	echo "MAC is the hardware address of the netboot interface (usually eth0)"
	echo "IP is the designated IP address of the node"
	echo
	echo
	echo "del ARGUMENT"
	echo "Where ARGUMENT can be either name, MAC address or IP address"
	echo "The script will attempt to remove whatever match it finds, so be careful"
	echo
	echo
	echo "check ARGUMENT"
	echo "Where ARGUMENT can be either name, MAC address or IP address"
        echo "The script will list any matching entries. This is a recommended"
	echo "action prior to removing a node"
	echo "The same logic used for finding entries to remove is used here"
	echo "so it is rather safe to run 'del' after a successful 'check' operation"
	echo
	exit 0
}
 
function check_inside() {
	# Will be used to either show or find the amount of matching entries"
	# Arguments: 	$1 - silent - 0 ; loud - 1
	# 		$2 - the string to search
	# The function will search the string as a stand-alone word only
	# so it will not find abc in abcde, but only in "abc"
 
	case "$1" in
		0) 	# silent
			RET=`grep -iwc $2 $DHCPD_CONF`
			;;
		1) 	# loud
			grep -iw $2 $DHCPD_CONF
			;;
 
		*)	echo "Something is very wrong with $0"
			echo "Exiting now"
			exit 2
			;;
	esac
 
	return $RET
}
 
function add_to() {
	# This function will add to the conf file
	# Arguments: $1 host ; $2 MAC ; $3 IP address
	echo "host $1 { hardware ethernet $2; fixed-address $3; }" >> $DHCPD_CONF
	[ "$?" -ne "0" ] && echo "Something wrong happened when attempted to add entry" && exit 3
}
 
function del_from() {
	# This function will delete a matching entry from the conf file
	# Arguments: $1 - expression
	[ -z "$1" ] && echo "Missing argument to del function" && exit 4
	if cat $DHCPD_CONF | grep -vw $1 > $DHCPD_CONF.$$
	then
		\mv $DHCPD_CONF.$$ $DHCPD_CONF
	else
		echo "Failed to create temp DHCPD file. Aborting"
		exit 5
	fi
}
 
function backup_conf() {
	cp $DHCPD_CONF $BACKUPS/`basename $DHCPD_CONF.$DATE`
}
 
function restore_conf() {
	\cp $BACKUPS/`basename $DHCPD_CONF.$DATE` $DHCPD_CONF
}
 
function check_wrapper() {
	# Perform check. Loud one
	echo "Searching for $REGEXP in configuration"
	check_inside 1 $REGEXP
	exit 0
}
 
function del_wrapper() {
	# Performs delete
	[ -z "$REGEXP" ] && echo "Missing argument for delete action" && exit 6
	backup_conf
	echo "Removing all invocations which include $REGEXP from config"
	del_from $REGEXP
 
	if /sbin/service $SERVICE restart
        then
                echo "Done"
        else
                restore_conf
                /sbin/service $SERVICE restart
                echo "Failed to update. Syntax error"
        fi
}
 
function add_wrapper() {
	# Adds to config file
	[ -z "$NAME" -o -z "$MAC" -o -z "$IP" ] && echo "Missing argument for add action" && exit 7
 
	for i in $NAME $MAC $IP
	do
		if check_inside 0 $i
		then
			echo -n .
		else
			echo "Value $i already exists"
			echo "Will not update duplicate value"
			exit 7
		fi
	done
	echo
 
	backup_conf
	add_to $NAME $MAC $IP
 
	if /sbin/service $SERVICE restart
	then
		echo "Done"
	else
		restore_conf
		/sbin/service $SERVICE restart
		echo "Failed to update. Syntax error"
	fi
}
 
function prep() {
	# Make sure everything is as expected
	[ ! -d ${BACKUPS} ] && mkdir ${BACKUPS}
}
 
prep
 
case "$1" in
	add)	NAME="$2"
		MAC="$3"
		IP="$4"
		add_wrapper
		;;
	check)	REGEXP="$2"
		check_wrapper
		;;
	del)	REGEXP="$2"
		del_wrapper
		;;
	help)	show_help
		;;
	*)	echo "Usage: $0 [add|del|check|help]"
		exit 1
		;;
esac

This script will update your /etc/dhcpd.conf.bootnet with new nodes. Place it somewhere in your path (for example: /usr/local/bin/ )

We will need the MAC address of a node. Run

net-node.sh add pxe100 00:16:3E:00:82:7A 10.100.1.100

This will add the node pxe100 with that specific MAC address to the DHCPD configuration.

Now, all you need to do is boot your client, and see how it works. Remember to disable other DHCP servers which might serve this node, or blacklist its MAC address from them.

Our next chapters will deal with my (future) attempt to make RHEL6 work under this setup (as a GI, and client, not as a server), and all kind of mass-deployment methods. If and when :-)

DSL (Damn Small Linux) Diskless boot

Friday, August 31st, 2007

I have come across a requirement to boot a thin client on a very cheap hardware into Linux. Due to the tight hardware requirements, and the tight budget, I have decided to focus on diskless systems, which can be easily modified and purchased to our needs.

Not only that, but due to the hardware configuration (Via 333MHz, 128MB RAM, etc) I have decided to focus on a miniature Linux system.

I dislike re-doing what someone else has done, unless I can do it noticeably better. I have decided to use DSL (Damn Small Linux) as my system of choice, with only minor changes to fit my needs:

Out of the “box”, I was unable to find network-boot DSL. Quickly searching their site, the version which seemed to fit was the initrd-only system. I downloaded it from this mirror, but you can find it as the dsl-x.x.x-initrd.iso file.

Extracting the initrd from the ISO file is quite simple:

mkdir /mnt/iso
mount -o loop dsl-x.x.x-initrd.iso /mnt/iso

And from here you can just copy the contents of the directory /mnt/iso/boot/isolinux/ selectively to your tftpboot directory.

So I got 50MB initrd which worked just fine. Changing this was quite a procedure, because in addition to the steps per the wiki hacking guide, I was required to extract the KNOPPIX file outside of the initrd, and repackage it when done. Quite messy, however, stand-alone as soon as the system has been able to boot.

An alternate I have decided to investigate into was of booting into nfs mount, aka, accessing the KNOPPIX iso disk through NFS and not through CDROM.

I was able to find some leads in DSL forums at this page, which lead to this guide. I was able to download pxe boot image from Knoppix themselves, however, it was based on an old kernel (2.4.20-XFS) which was part of Knoppix 3.3 (cannot find it anymore) and although reached the level of actually booting my nfs, didn’t include enough network drivers (I wanted pcnet32 to be able to “play” with VMware for the task), and was incompatible with my existing DSL.

I had opened the supplied Knoppix initrd, and replaced the modules version to the one supplied with DSL – 2.4.24, per the rest of the system. In addition, I have added my required modules, etc, and was able to boot successfully both on VMware and on the thin client hardware.

To replace the modules, one needs to follow these general-only guidelines (these are not exactly step-by-step instructions):

Mount through loop the DSL KNOPPIX image, for example, in /mnt/dsl
Uncompress the Knoppix PXE initrd
Mount through loop the uncompressed Knoppix PXE initrd, for example, in /mnt/initrd
cd to /mnt/initrd/modules
Replace all modules in the current tree with the ones supplied by DSL, obtainable from /mnt/dsl/lib/modules/2.4.26 directory tree, including the cloop.o module
Umount the initrd image
Compress the initrd image
Boot using DSL linux and the new initrd image.

In order to boot successfully, you need to supply the pxe boot these two instructions:

nfsdir=nfs-server:/path/to/KNOPPIX directory

(since I was quite unsure about the letter case required, I have created a symlink from lower-case to upper case, so I had a link /mnt/KNOPPIX to a directory /mnt/knoppix, and inside this directory, a file called knoppix and a symbolic link to this file KNOPPIX. In my case, the exported path was /mnt/ only. Notice this one!).

BOOT_IMAGE=KNOPPIX – but you can have different KNOPPIX images for different purposes.

Finally it has worked correctly. Changes can be done only to the KNOPPIX iso image, per the hacking guide.

This is my PXE-enabled initrd, based on the text above, which fits DSL-3.4.1: minirt24.gz