Archive for March, 2007

High load average due to hardware issues

Friday, March 30th, 2007

Performance tuning is a sort of art. You know what you expect to reach, and you somehow strive towards it through selective tuning, be it of your OS memory utilization, your network settings, NFS mount parameters, etc.

I’ve been to a customer whose server acted funny. First, it had a high load average – for an idle server with 2 CPUs, a load average which never gets below 1.0 can be considered high.

Viewing the logs, I saw lots of PS/2 error messages. It seems that the hotplug daemon had been busy respawning several times a second due to incorrect hardware detection triggered by these PS/2 errors, and this caused the high load average (many processes in the CPU queue). Disconnecting the PS/2 connection between the server and the KVM solved the issue, and within around 2 minutes the load average decreased to around 0.02.
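A quick way to look for this kind of symptom is to compare the load average with the CPU count, and to see how fast new PIDs are being handed out. What follows is only a minimal sketch, assuming a Linux system with /proc mounted. The variable names are mine, and the one-second sampling window is an arbitrary choice.

```shell
#!/bin/bash
# Count the CPUs the kernel reports.
cpus=$(grep -c '^processor' /proc/cpuinfo)

# /proc/loadavg holds the 1-, 5- and 15-minute load averages,
# the runnable/total scheduling entities, and the most recently assigned PID.
read -r load1 load5 load15 entities last_pid < /proc/loadavg
echo "CPUs: $cpus, load average: $load1 $load5 $load15"

# Sample the most recent PID again one second later. A large jump on an
# otherwise idle server hints that something keeps respawning processes.
# (The difference is meaningless if the PID counter wrapped around.)
sleep 1
read -r _l1 _l5 _l15 _ent new_pid < /proc/loadavg
echo "PIDs handed out in the last second: $((new_pid - last_pid))"
```

On an idle machine I would expect the second number to stay very small. Threads consume PIDs too, so this is a hint rather than an exact process count.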

Hardware-related problems are usually the most intensive performance hogs, and also the easiest to solve.

The cutest useless thing I could have wanted

Wednesday, March 28th, 2007

I was browsing the blog of a friend of mine, xslf, when I read this post dealing with this cute wifi rabbit. I then went on to browse its website.

I want one. This rabbit just caught me so badly that I even set the icon for the "Not Really Technical" section to be its figure. I hope I don’t break any trademark rule… Still – I do have a link directly to their website…

HP MSA1000 controller failover

Tuesday, March 27th, 2007

HP MSA1000 is an entry-level disk storage system capable of communicating via different types of interfaces, such as SCSI and FC, and can allow FC failover. This FC failover, however, is controller failover and not path failover. It means that if the primary controller fails entirely, the backup controller will “kick in”. However, if a multipath-capable client loses its primary interface, there is no guarantee that communication with the disks through the backup controller will work.

The symptom I encountered was that the secondary path, while exposing the disks to the server (while the primary path was down for one of the servers), did not allow any SCSI I/O operations. This prevented the Linux server’s SCSI layer from accessing the disks. So they did appear when doing “cat /proc/scsi/scsi”, however, they were not detected using, for example, “fdisk -l”, and the system logs got filled with “SCSI Error” messages.
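The mismatch described above, devices that are listed but refuse I/O, can be checked for directly. This is a rough sketch and not something from the original troubleshooting session: can_read is a hypothetical helper of mine, and the /dev/sd? glob assumes the usual Linux SCSI disk naming. Reading real disks needs root.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if one 512-byte block can be read.
can_read() {
    dd if="$1" of=/dev/null bs=512 count=1 2>/dev/null
}

# Show what the SCSI layer lists (the file may be absent on some kernels).
[ -r /proc/scsi/scsi ] && cat /proc/scsi/scsi

# Try an actual read from each SCSI disk node that exists.
for dev in /dev/sd?; do
    [ -b "$dev" ] || continue      # skip an unexpanded glob or non-devices
    if can_read "$dev"; then
        echo "$dev: read OK"
    else
        echo "$dev: device node exists, but read failed"
    fi
done
```

A disk that shows up in /proc/scsi/scsi but fails the read is in the state I ran into on the secondary path.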

About a month ago, after almost two years, a new firmware update was released (can be found here). Two versions exist – Active/Passive and Active/Active.

I have upgraded the MSA1000 storage device.

After installing the Active/Active firmware upgrade (Linux users, take notice: you must have X to run the “msa1500flash” utility), and after power cycling the MSA1000 device, things started to look good.

I have tested performance with a person on-site disconnecting fiber connections on-demand, and it worked great. About 2-5 seconds failover time.

Since this system runs Oracle RAC, and it uses OCFS2, I had to update the failed-node timeout to 31 seconds (per Oracle’s OCFS site, which includes some really good tips).

So real High Availability can be achieved after upgrading the MSA1000 firmware.

New design to my blog!

Sunday, March 25th, 2007

Thanks to my charming wife, my blog has been redesigned to be somewhat more appealing. I have noticed that many of the techno-babble blogs or personal websites look bad. Usually – black on white at most. Sometimes, some awful design.

I am proudly not part of *this* group anymore :-)

Compaq Proliant 360/370/380 G1 cpqarray problems with Ubuntu

Saturday, March 24th, 2007

Or, for that matter, any other Linux distribution that:

a. Uses kernel 2.6.x up to 2.6.18

b. Does not dynamically create the initrd as part of the installation

Ubuntu, for that matter, is an example of a distribution that matches both. While it does create the initrd, it doesn’t create it dynamically per the output of ‘lspci’, which results in the inclusion of every SCSI module that exists.

The symptoms – you can install the system, however, you are unable to boot it afterwards. You might get dropped into your Busybox initrd shell. The cpqarray module doesn’t detect any arrays. The error is "cpqarray: error sending ID controller". You will notice that the module sym53c8xx is loaded.

I searched for a solution and found an initial hint in this blog, however, the entry was not completely accurate. Following the tips given in this page, I was able to understand that there was a bug in the kernel which caused the sym53c8xx module to take over the cpqarray. I was required to remove the module from the initrd. I booted into rescue mode from the Ubuntu Server CD, and from there did the following:

1. Mount /boot

2. Add the following modules list to your /etc/initramfs-tools/modules – modules-proliantG1.txt

3. Edit /etc/initramfs-tools/initramfs.conf to change "MODULES=most" to "MODULES=list"

4. Run "update-initramfs -k 2.6.17-11-server -c" (this is relevant in my case – up-to-date Ubuntu server 6.10. For other versions, check which installed kernel version is the latest. This can be found by a mere ls on /lib/modules/)
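Steps 3 and 4 can be sketched as a small script. This is an illustration under assumptions, not the exact commands I typed: set_modules_list and latest_kernel are hypothetical helpers, and the demonstration works on scratch copies so that nothing under /etc or /lib is touched.

```shell
#!/bin/bash
# Hypothetical helper: switch an initramfs.conf from MODULES=most to MODULES=list.
set_modules_list() {
    sed 's/^MODULES=most$/MODULES=list/' "$1" > "$1.new" && mv "$1.new" "$1"
}

# Hypothetical helper: print the highest kernel version found under a modules directory.
latest_kernel() {
    ls "$1" | sort -V | tail -n 1
}

# Demonstration on scratch copies with two made-up installed kernels.
scratch=$(mktemp -d)
conf="$scratch/initramfs.conf"
echo 'MODULES=most' > "$conf"
mkdir -p "$scratch/modules/2.6.17-10-server" "$scratch/modules/2.6.17-11-server"

set_modules_list "$conf"
kver=$(latest_kernel "$scratch/modules")

cat "$conf"
# On the real system the helpers would be pointed at
# /etc/initramfs-tools/initramfs.conf and /lib/modules, and this
# command would be run instead of printed:
echo "update-initramfs -k $kver -c"
```

Picking the version with sort -V is my own shortcut for the "mere ls" in step 4. It is still worth eyeballing /lib/modules before trusting it.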

After the reboot I was pleased to discover that my system was able to boot correctly, and I know it will do so for updated versions of the kernel.

Finally had some time today

Thursday, March 22nd, 2007

So this post is not technical by nature.

Today I gave away 14 PCs and 5 VGA screens. All are in some-not-exactly-unworking-condition, which means that you can probably mix two computers into one, or you need only add some RAM, an HDD or several other components to make any of these PCs work.

All of them are either Pentium1 or Pentium2 class (I think there was one AMD K6 there).

A picture of the pile before the giveaway:

This is the pile – 14 PCs and 5 VGAs!

So I searched for anyone who was willing to take them, and found one. I was surprised at how quick the responses were. It was less than 5 hours from the time I posted my give-away offer until the computers were gone.

You can say it’s for a good cause – the person who got them tries to make them work, then installs Xubuntu on them (a very lightweight distro), and gives them away (or sells them at a near-zero price) to people with no computer, and usually with zero computer skills. He teaches them to use the computer for the general day-to-day needs WE all are familiar with. This is an honorable task, I must say, and I salute him. Not only is he adding users to the pool of modern society, he is also doing it for near-zero pay, and actually making them Linux users – plain dumb-I-dont-have-viruses computer users. Can hardly be better.

Bash – Handling children and termination signals

Wednesday, March 21st, 2007

First and unrelated – this is my birthday. It reminds me that another year passed, and generally speaking, I do not take this too well…

Due to massive SPAM attacks, my commenting system has been turned off for a while now, and I need to see how I can re-enable it safely.

Bash – here we go.

When you want a single script to spawn several commands in parallel, the best way is to put an ampersand at the end of each command, for example:

/usr/bin/find / -name 123 &

/bin/grep -r abc / &

etc.

If you do not want the output from these commands to mix together, you would probably wish to redirect it to a file, for example (redirecting all outputs):

/usr/bin/find / -name 123 &>/tmp/find.out &

/bin/grep -r abc / &>/tmp/grep.out &

You can later “cat” the two files in your own desired order.

This raises two interesting issues. The first is how you can tell that both commands have finished. There are several methods, such as collecting their PIDs and looping with “sleep” until they are no longer there. An alternative, and more elegant, method is to use “wait”. This command will wait for both commands in our example (or for as many commands as you have sent to the background) to finish, and only then continue. So we can add, in our example, the following lines:

wait

cat /tmp/find.out

cat /tmp/grep.out

This will ensure that both outputs are not mixed together, and are readable.
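“wait” can also be given a specific PID, in which case it returns that command’s exit status. The sketch below uses short stand-in commands instead of find and grep so that it finishes quickly. The file and variable names are arbitrary.

```shell
#!/bin/bash
out1=$(mktemp)
out2=$(mktemp)

# Two stand-in background jobs, each with its output redirected to its own file.
( sleep 1; echo "first job done" ) > "$out1" 2>&1 &
pid1=$!
( echo "second job done"; exit 3 ) > "$out2" 2>&1 &
pid2=$!

# Wait for each job separately and keep its exit status.
status1=0
wait "$pid1" || status1=$?
status2=0
wait "$pid2" || status2=$?

# Print the outputs in the order we choose, not the order they finished in.
cat "$out1" "$out2"
echo "exit codes: $status1 $status2"
```

This way the script can tell which of the commands failed, which a bare “wait” does not report.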

The second issue, a side effect of running the commands in the background, is what happens when they need to be killed. Let’s assume that our script is time-limited, and that if it exceeds its time limit, it gets killed. In this case the script itself will be killed, however, its children will not die. They will become owned by init, PID 1, and keep running. Assume, for that matter, that we run the main script every 10 minutes, and that it is limited to these ten minutes. We might kill the system’s I/O performance, since we might reach a case where several “find” commands are running in parallel, each invoked by our main script at a different time.

To handle such a case, we can use the command “trap”. It allows us to handle signals in whatever way we desire. Notice that if you capture SIGTERM (kill -15 – the default kill) and misuse it, the only method of stopping the main script will be by invoking SIGKILL (kill -9) on it, which bypasses all trap directives.

In our example, let’s add this (assume we are aware of each PID)

trap "kill $PID1 $PID2 ; exit 0" SIGTERM

So we can sum up our example script to be like this:

/usr/bin/find / -name 123 &>/tmp/find.out &

PID1=$!

/bin/grep -r abc / &>/tmp/grep.out &

PID2=$!

trap "kill $PID1 $PID2 ; exit 0" SIGTERM

wait

cat /tmp/find.out

cat /tmp/grep.out
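To see the trap doing its job, here is a self-contained test harness. It is a sketch under assumptions: the children are harmless “sleep” commands rather than find and grep, the temporary file names are arbitrary, and the trap is a variant of the one above that also reaps the children with “wait” before exiting.

```shell
#!/bin/bash
script=$(mktemp)
pidfile=$(mktemp)

# A stand-in for the main script: two long-running children and a TERM trap.
cat > "$script" <<'EOF'
#!/bin/bash
sleep 300 &
PID1=$!
sleep 300 &
PID2=$!
# Kill both children, reap them, then exit.
trap 'kill "$PID1" "$PID2" 2>/dev/null; wait; exit 0' SIGTERM
echo "$PID1 $PID2" > "$1"
wait
EOF

bash "$script" "$pidfile" &
main_pid=$!

# Wait until the script has recorded its children (the trap is set by then).
until [ -s "$pidfile" ]; do sleep 1; done

kill -TERM "$main_pid"       # what a time-limit enforcer would do
wait "$main_pid"

# Check whether either child outlived the main script.
read -r child1 child2 < "$pidfile"
survivors=0
for pid in $child1 $child2; do
    kill -0 "$pid" 2>/dev/null && survivors=$((survivors + 1))
done
echo "children still running: $survivors"
rm -f "$script" "$pidfile"
```

With the trap in place the harness should report no surviving children. Removing the trap line from the stand-in script should leave both “sleep” commands running, owned by init.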

This wraps it up. Hope it helps.

Bash – Variable indirection – Using variable contents as a(nother) variable name

Tuesday, March 20th, 2007

This was a tricky action. Assume I have a list of variables, obtained by an external source:

var1=a

var2=b

var3=c

I cannot use a loop with the phrase ${var$i} in it (where i is the integer counter). It just doesn’t work. I used this instead to assign the values to an array:

var[$i]=$(eval echo "\${var${i}}")

That way, I was able to loop through these values later easily.

So… we can use assigned var names inside a var if we do it right: $(eval echo "\${var${i}}")
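Bash also has a built-in indirect expansion, ${!name}, which does the same job without eval. A minimal sketch, assuming bash (this is not POSIX sh), with "values" and "name" as arbitrary names of mine:

```shell
#!/bin/bash
var1=a
var2=b
var3=c

values=()
for i in 1 2 3; do
    name="var$i"            # build the variable name as a string
    values[$i]=${!name}     # expand the variable whose name is stored in $name
done

echo "${values[@]}"         # prints: a b c
```

I find this form easier to read than the eval one, and it does not re-evaluate the variable's contents as shell code.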

My manuscript for a happy Ubuntu 64-bit installation

Monday, March 19th, 2007

I have been extremely busy for the last few weeks and couldn’t find time to update my blog, so apologies are in order. I am sorry, and later this week I will add several tips and tricks about BASH scripting, which might save time and effort for those of you who use it for more complicated tasks.

But alas – this post is about the things that need to be done after a new installation of Ubuntu Edgy (6.10) x86_64, including, of course, links.

First – This is PentiumD 2.8GHz, Duo, on Abit IB9 Mobo (which I wasn’t too impressed with), 2x 320GB Sata2 HDDs, 2GB RAM and NVidia 7100 Dual-Head (I didn’t want ATI due to their limitations with max accelerated resolution, and the limitations it imposed on my Dual-head setup).

Initial installation as follows:

Edgy Server x86_64, created software mirror (raid1) for /boot, 2x 2GB swap spaces (one on each HDD), and LVM2 VG on mirror on the rest of the disk. Created LV for “/” (10GB XFS) and LV for “/home” (30GB Reiserfs).

During installation the Mobo didn’t recognize my IDE CD, and as the quickest remedy I used a USB-to-IDE adapter with an additional CDROM, which worked just fine.

Post installation I had to fix /etc/fstab to point to the correct (and now working) IDE CDROM.

To install the full Ubuntu desktop, I used “sudo aptitude install ubuntu-desktop”. Sound worked out of the box.

Requirements:

- Skype

- Hebrew TTF Fonts

- mplayer

- Beryl (+XGL because of NVidia)

- Flash in Firefox

Skype:

The Skype website allowed me to download the statically compiled Skype package. It didn’t work, of course, since it was 32bit only. I have installed the following additional packages:

ia32-libs ia32-libs-gtk lib32asound2 lib32objc1 linux32 lib32ncurses5 ia32-libs-sdl

I extracted the archived Skype package, moved its contents to /usr/lib/skype and created a symlink at /usr/bin/skype pointing to /usr/lib/skype/skype.

Hebrew TTF Fonts:

It was a bit more tricky. I had to get these fonts from some Windows machine. I got them from one of my licensed desktops, and copied them (only .TTF and .ttf) to /usr/share/fonts/truetype/ttf-windows – a directory created for this purpose. I then created a symlink for every ttf file in this directory in /var/lib/defoma/x-ttcidfont-conf.d/dirs/TrueType, which gets included in the default xorg.conf. After restarting X, it worked like a charm.
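The symlinking step is a simple loop. Here is a sketch of it, with link_fonts as a hypothetical helper and made-up font file names, run against a scratch directory so it can be tried without touching the real font directories.

```shell
#!/bin/bash
# Hypothetical helper: link every .ttf/.TTF file from one directory into another.
link_fonts() {
    src=$1
    dst=$2
    for font in "$src"/*.ttf "$src"/*.TTF; do
        [ -e "$font" ] || continue      # skip patterns that matched nothing
        ln -sf "$font" "$dst/"
    done
}

# Demonstration in a scratch directory with made-up file names.
scratch=$(mktemp -d)
mkdir "$scratch/ttf-windows" "$scratch/TrueType"
touch "$scratch/ttf-windows/example1.ttf" \
      "$scratch/ttf-windows/EXAMPLE2.TTF" \
      "$scratch/ttf-windows/notes.txt"

link_fonts "$scratch/ttf-windows" "$scratch/TrueType"
ls "$scratch/TrueType"
```

On the real system the two arguments would be /usr/share/fonts/truetype/ttf-windows and /var/lib/defoma/x-ttcidfont-conf.d/dirs/TrueType, and the command would need root.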

mplayer:

I have installed mplayer using the restricted and multiverse repositories. I was surprised when I was able to play movies out of the box. Maybe my common codecs are just enough… I will look into it later.

XGL:

I have installed the latest NVidia driver for amd64 from NVidia’s site, and configured Dual-Head setup per my already-existing-too-messy xorg.conf file. xorg.conf.nvidia

Followed the Beryl Wiki for Ubuntu, by the letter. Mind you – I was aiming at XGL with Gnome.

I was so delighted when it turned out to work with my Dual-Head at a total resolution of 2560×1024.

Flash in Firefox:

That was the trickier one. I managed to find this guide in the Ubuntu forums, which was more than enough for me. I did not notice on the first attempt, however, that there are two RPM packages required, and thus failed the procedure. Once I noticed it, I was able to complete the task flawlessly.

So, now I have a completely working system, per my needs and requirements. I’m very happy, and I hope I gave good pointers to others who want to use their new 64bit system in a normal manner, even when some vendors do not supply 64bit compatible binary software.

Remember the power of open source – if it is required to work under a 64bit environment it will be ported to one, while commercial software companies tend to fall behind with new, and sometimes not too popular, proprietary systems.