Posts Tagged ‘Lilo’

Preparation of a recovery server for RPM-based systems

Monday, July 23rd, 2007

In most cases, when preparing a recovery server, you can simply ‘tar’ the entire server’s contents and move the archive, along with a short recipe on how to rebuild the original partition layout (software RAID? LVM? flat partition tables?), how to mount the volumes in place, how to extract the tar files into the right locations, and how to install your favorite boot loader, either Lilo or Grub. Beforehand, you take a nice snapshot (or capture the system in single-user mode), and life is good, yada yada yada.

Most of us, however, never prepare for a rainy day. It’s not that we don’t want to, and it’s not that we don’t plan; we just never seem to get to it, and after all, the hardware is rather new, so there should be no reason for failure. I guess some of you have heard this before, maybe in your own voice.

So, backup is a tiring job, and I will not deal with the things you need to do to maintain a replica of your data, but I will deal with how to prepare, quickly and easily, a system-recovery server (or a postmortem server) with just a little thought beforehand. This might be a bit too late for you if you’re reading it now, but, well, for the next time…

This little trick worked (as part of a large-scale process) when I migrated a server from a 64bit server to a 32bit server (yeah, I know – the other way around).

It assumes you use RPM as your tool to install applications, and that if you do not, you have a method of knowing which piece of software you installed from source, and which package was installed from an external source (not your day-to-day RPM repository).

On the source server, run ‘rpm -qa > /tmp/rpmlist.long’. Keep this file. It is important. Also, try to keep your yum.repos.d directory, or at least know which RPM repositories you use (I always use rpmforge, so I see no problem with that).

Install your target server: same version as the source, minimal package selection. Copy the file rpmlist.long to /tmp on it. Make sure yum is configured (I deal here with YUM, but you can replace it with any other repository client of your choice). Run the following two commands:

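# Strip the version and release, keeping only the package names
sed 's/-[0-9].*$//' /tmp/rpmlist.long > /tmp/rpmlist-short

# Install every listed package; yum pulls in the dependencies
for i in $(cat /tmp/rpmlist-short); do
    yum install -y "$i"
done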

This will add the missing RPMs with their dependencies, and will bring your system to a similar status. At least, this is a good place to start recovering.
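One caveat with the sed expression above: it cuts at the first dash that is followed by a digit, so a package whose name itself contains such a part gets shortened too far. Here is a minimal sketch of a more conservative variant, which removes only the last two dash-separated fields (version and release). The helper name strip_nvr and the sample package lines are my own illustration, not part of rpm or yum.

```shell
# Strip the trailing "-version-release" from each line on standard input.
# strip_nvr is a hypothetical helper name used only for this sketch.
strip_nvr() {
    sed 's/-[^-]*-[^-]*$//'
}

# Sample lines standing in for the contents of /tmp/rpmlist.long.
printf '%s\n' \
    'bash-3.0-19.3' \
    'kernel-smp-2.6.9-11.EL' \
    'compat-libstdc++-33-3.2.3-47.3' | strip_nvr
```

With the real file this would be used as strip_nvr < /tmp/rpmlist.long > /tmp/rpmlist-short. If you still have access to the source server, asking rpm for the names directly with rpm -qa --qf '%{NAME}\n' avoids the text processing altogether.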

In future chapters:

– Fully migrating from 64 to 32 bit and vice versa

– Using LVM snapshots for a smart backup, and for a smart recovery

DL140 Generation 1 (G1) and LILO do not work

Wednesday, February 14th, 2007

I add to this blog any piece of information that I think might help other people.

One of the things I encountered a short while ago was a DL140 G1 system that could not boot Linux.

Its Linux system (RedHat 4 32bit) was deployed by an external system, and it could not boot. However, when installed from CD, the system booted just fine. The symptom was that after booting, the screen showed:

LILO: Loading Linux…..

and that’s all. The system could boot after several days (and did so once); however, this is not really a desirable state.

It seems to be an issue of Lilo and this specific hardware. Other systems (non DL140) were able to boot just fine using Lilo, and this same kernel version was bootable on that system through other means.

Replacing Lilo with GRUB during installation/deployment solved the issue. FYI.

Upgrading Ubuntu 6.06 to 6.10 with a non-standard software RAID configuration

Sunday, October 29th, 2006

A copy of a post I made in the Ubuntu forums, in the hope that it helps others who find themselves in the situation I was in. Available through here

In hindsight, I think the title should say non-common rather than non-standard…

Hi all.

I passed through this forum yesterday, searching for a possible cause of a problem I have had.
The theme was almost well known: after the upgrade from 6.06 to 6.10 the system wouldn’t boot. Lilo (I use Lilo, not Grub) would load the kernel, and after a while, without ever reaching "init", the system would wait for something, I didn’t know what.

I tried disconnecting USB devices, setting different boot parameters, and even verified (using the kernel messages) that the disks didn’t change their locations (they did not, although I have an additional IDE controller). Alas, nothing helped.

The weird thing is that during this wait (there was keyboard response, but the system didn’t move on…), the disk LED flashed at a fixed rate. I thought it might have to do with a RAID rebuild; however, from the live-cd, there were no signs of one.

Then, on one of the "live-cd" boots, accessing the system via "chroot" again, I decided to open the initrd used by the system, in an effort to dig into the problem.
The following set of commands did it:
cp /boot/initrd.img-2.6.17-10-generic /tmp/initrd.gz
cd /tmp && mkdir initrd.out && cd initrd.out
gzip -dc ../initrd.gz | cpio -id

Following this, I had the initrd image extracted, and I could look into it. Just to be on the safe side, I looked into the image’s /etc, and found the file mdadm/mdadm.conf there.
This file had different (wrong!) UUIDs for my software RAID setup (compared with "mdadm --detail /dev/md0" for md0, etc).
I traced the origin of this file to the system’s real /etc/mdadm/mdadm.conf, which was created a while ago, before I made many manual modifications (changed disks, destroyed some of the md devices, etc). I fixed the system’s real /etc/mdadm/mdadm.conf file to reflect the current setup, and recreated the initrd.img file for the system (now with the up-to-date mdadm.conf). I updated Lilo, and the system was able to boot correctly this time.
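For anyone who wants to repeat the UUID comparison without reading the files by eye, here is a minimal sketch. It assumes the ARRAY lines carry a UUID= field, as they do in mdadm.conf and in the output of "mdadm --detail --scan". The helper name array_uuids, the file names and the UUID values are made up for the illustration.

```shell
# Print the UUID of every ARRAY line read from standard input, sorted.
# array_uuids is a hypothetical helper name, not an mdadm command.
array_uuids() {
    sed -n 's/^ARRAY.*UUID=\([0-9a-fA-F:]*\).*/\1/p' | sort
}

# Sample data standing in for the two sources (the UUIDs are invented):
#   stale.conf   ~ etc/mdadm/mdadm.conf as extracted from the initrd
#   running.conf ~ what "mdadm --detail --scan" reports for the live arrays
workdir=$(mktemp -d)
cat > "$workdir/stale.conf" <<'EOF'
DEVICE partitions
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=11111111:22222222:33333333:44444444
EOF
cat > "$workdir/running.conf" <<'EOF'
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=aaaaaaaa:bbbbbbbb:cccccccc:dddddddd
EOF

array_uuids < "$workdir/stale.conf"   > "$workdir/stale.uuids"
array_uuids < "$workdir/running.conf" > "$workdir/running.uuids"

# Identical UUID lists mean the initrd copy is up to date.
if cmp -s "$workdir/stale.uuids" "$workdir/running.uuids"; then
    result="match"
else
    result="MISMATCH"
fi
echo "$result"
rm -rf "$workdir"
```

On a real system the second list would come from piping "mdadm --detail --scan" into array_uuids, and the first from the mdadm.conf found inside the extracted initrd directory.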

The funny thing is that even the previous kernel, which had its initrd.img built long ago and which had worked fine for a long while, failed to complete the boot process on the upgraded system.

My system relevant details:

/dev/hda+/dev/hdc -> /dev/md2 (/boot), /dev/md3

/dev/hde+/dev/hdg -> /dev/md1
/dev/md1+/dev/md3 -> LVM2, including the / on it

Lilo is installed on both /dev/hda and /dev/hdc.

Why I don’t like GRUB – RHEL4 on system with local and external storage

Sunday, July 16th, 2006

Installed RHEL4 on a system with both internal storage (HP Smart Array 5i – cciss) and external disks through a Qlogic FCS HBA.

During install, the local disks were detected as /dev/cciss/c0d0 while the external disks were detected as /dev/sda and /dev/sdb

After installation was done, Grub started with an incorrect mapping. For no apparent reason, Grub searched for its stage2 and data in /dev/sda1 and not in /dev/cciss/c0d0p1.

The quickest way for me to solve it was to replace grub with Lilo (available on the fourth RHEL CD), correct /etc/lilo.conf.anaconda, copy this file to /etc/lilo.conf and run Lilo (with "-v" flag, for safety). It worked like a charm.

HP ML110 G3 and Linux Centos 4.3 / RHEL 4 Update 3

Tuesday, May 30th, 2006

Using the same installation server as before, my laptop, I was able to install Linux Centos 4.3, with the addition of HP’s drivers for Adaptec SATA raid controller, on my new HP ML110 G3.

Using just the same method as before, when I installed Centos 4.3 on an IBM x306, but with HP drivers, I was able to do the job easily.

As a reminder, here is the process of preparing the setup:

(A note – When I say "replace it with it" I always recommend you keep the older one aside for rainy days)

1. Obtain the floppy image of the drivers, and put it somewhere accessible, such as some easily accessible NFS share.

2. Obtain the PXE image of the kernel of Centos4.1 or RHEL 4 Update 1, and replace your PXE kernel with it (downgrade it)

3. Keep the driver’s RPM and the Centos 4.1 / RHEL 4 Update 1 kernel RPM handy on your NFS share.

4. Do the same for the PXE initrd.img file.

5. Obtain the /Centos/base/stage2.img file from Centos 4.1 or RHEL 4 Update 1 (depends on the installation distribution, of course), and replace your existing one with it.

6. I assume your installation media is actually NFS, so your boot command should be something like: linux dd=nfs:NAME_OF_SERVER:/path/to/NFS/Directory

This should, and would, work like a charm. Notice you need to use the 64bit kernel with the 64bit driver, and the same goes for 32bit. It won’t work otherwise, of course.

After you’ve finished the installation, *before the reboot*, press Ctrl+Alt+F2 to switch to text console, and do the following:

1. Copy your kernel RPM to the new system /root directory: cp /mnt/source/prepared_dir/kernel….rpm /mnt/sysimage/root/

2. Do the same for HP drivers RPM

3. Chroot into the new system: chroot /mnt/sysimage

4. Install (with --force if required, but *never* try it first) the RPMs you’ve put in /root. First the kernel and then the HP driver.

5. The HP driver RPM will fail its post-install. It’s OK. Rename /boot/initrd-2.6.9-11.ELsmp (or non SMP, depending on your installed kernel)

6. Verify you have an alias for the new storage device in your /etc/modprobe.conf

7. Run mkinitrd /boot/initrd-2.6.9-11.ELsmp 2.6.9-11.ELsmp (or non SMP, depending on your kernel)

8. Manually edit your /etc/grub.conf to your needs.
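The alias check in step 6 can be scripted. Here is a minimal sketch, assuming the alias takes the usual "alias scsi_hostadapter <driver>" form that mkinitrd looks for. The helper name has_storage_alias and the driver name example_hba are placeholders of mine; substitute the module name that HP’s driver RPM actually installs.

```shell
# Succeed if the given modprobe.conf-style file has a scsi_hostadapter
# alias for the named driver. has_storage_alias is a hypothetical helper.
has_storage_alias() {
    conf_file=$1
    driver=$2
    grep -Eq "^[[:space:]]*alias[[:space:]]+scsi_hostadapter[0-9]*[[:space:]]+${driver}[[:space:]]*\$" "$conf_file"
}

# Sample file standing in for /etc/modprobe.conf; "example_hba" is a
# placeholder driver name, not the real HP module name.
sample=$(mktemp)
cat > "$sample" <<'EOF'
alias eth0 tg3
alias scsi_hostadapter example_hba
EOF

if has_storage_alias "$sample" example_hba; then found=yes; else found=no; fi
if has_storage_alias "$sample" other_hba; then missing=yes; else missing=no; fi
echo "example_hba: $found, other_hba: $missing"
rm -f "$sample"
```

Inside the chroot you would point it at /etc/modprobe.conf before running mkinitrd, since an initrd built without the alias will not load the storage driver.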

Note – I do not like Grub. Actually, I find it lacking in many ways, so I install Lilo from the i386 (not the 64bit, since it’s not there) version of the distro. Later on, you can rename /etc/lilo.conf.anaconda to /etc/lilo.conf, and work with it. Don’t forget to run /sbin/lilo after changes to this file.