Archive for June, 2007

Network Bridge

Wednesday, June 27th, 2007

Despite what the title suggests, this is not about silently routing packets between interfaces, or bridging multiple networks. It is all about how ants, which find summer the best time to start investigating our place, can show inventiveness and prove that even ants can use network bridges when required.

The cable hung in mid-air; it was not attached to the wall anywhere along its length.

Here is a close-up of how ants actually use Cat5e cables for their own benefit:

Resting on the ‘5’

Painful upgrade from Edgy x86_64 to Feisty x86_64

Sunday, June 24th, 2007

If it works, don’t touch it. This is one of my mottoes. I broke this rule just yesterday, when I decided that I was too lazy to install Pidgin from source and wanted to install it directly from a deb. Unfortunately, there was no Pidgin deb for Edgy. None that I was able to find.

My computer has been offering to upgrade itself for a while now, ever since Feisty became available. I was cautious and avoided upgrading until now. I had already installed Feisty on my laptop and on one of my servers (installed Edgy and then upgraded to Feisty with no special events), so I was somewhat more at ease. This was, of course, a complete disaster.

Upgrading Edgy to Feisty went OK. Nothing really special, no external sources, nothing. After the upgrade, the system failed to boot; it just hung there. It appears (and I have yet to file a bug) that my IT8212 IDE controller (to which my CDROM is connected) hangs the computer.

Not only that, but even with the controller disabled, it appears that Feisty’s kernel has an issue with sata_iix. The issue was solved using post #59 from this bug report. Do not, however, follow this recommendation (all_generic_ide), as you will experience a noticeable performance hit.

I was able to boot my system. No CDROM, but working. I installed the NVidia drivers manually, as the restricted modules were too old. I had to remove the nvidia entries in /etc/modprobe.d/lrm-video (probably because I had installed the restricted modules and later removed them). I had X running, but Beryl was not working. Past experience taught me that AIGLX or direct NVidia DRI are slower than XGL. Attempting to use XGL, I got the white-screen-of-death. Following this guide, I was able to set up XGL correctly, or so it seems. It did not solve my white-screen-of-death; however, with the --use-copy flag things worked, and seemed to respond fast enough.

Still have to open a bug about the IT8212 device. Hope for the best.

Misconfigured Amavisd and its impact

Tuesday, June 19th, 2007

As an administrator, I am responsible for many setups and configurations, sometimes hand tailored to supply an answer to a set of given demands.

As a human, I err, and the common method of verifying that you have avoided error is to ask one simple question: “Does it work after these changes?”

In the world of computers there is hardly ever a simple true or false. We would have expected it to be a boolean world, where either it works or it doesn’t, but we are not there. The world of computers is filled with “works better” and “works worse”, and sometimes we forget that.

This long prologue was meant to bring up the subject of monitoring and evaluating your actions. While the simplest method of evaluation remains “Does it work?”, there are some additional, more subtle methods of verifying that things work according to your specifications.

One of the tools which helps me see, in the mirror of time, the effect of changes I have made is a graphing tool called Cacti. It graphs a set of predefined parameters which I have chosen. It has no special AI, it cannot guess anything, and I am quite happy with it, as it lets me understand the course of events better for myself.

This post is about a misconfigured Amavisd daemon. Amavis is a wrapper which scans mail supplied by the local MTA using both SpamAssassin and a selected antivirus (ClamAV, in my case, as it has proven itself to me as a good AV).

Its configuration had a directive looking like this:

['ClamAV-clamscan', 'clamscan',
"--stdout --disable-summary -r --tempdir=$TEMPBASE {}", [0], [1],
qr/^.*?: (?!Infected Archive)(.*) FOUND$/ ],

It worked. However, this server, as it turned out, had been heavily loaded for a while. Since it’s a rather strong server, this was not really visible unless you looked at the server’s Cacti graphs. More than 80% of the time the CPUs were at 100%, consumed by the ‘clamscan’ process. Yesterday I decided to solve the heavy load, and for that modified the file ‘/etc/amavisd.conf’ to include the primary ClamAV section as follows:

['ClamAV-clamd',
\&ask_daemon, ["CONTSCAN {}\n", "/tmp/clamd"],
qr/\bOK$/, qr/\bFOUND$/,
qr/^.*?: (?!Infected Archive)(.*) FOUND$/ ],

This uses clamd instead of clamscan, so the virus database is loaded once by the daemon rather than on every scan. The result was a drastic decrease in CPU consumption and system load average, as can be seen in the Cacti graph (around 4 AM):

Cacti load average graph

The point is that while both configurations worked, I had the tools to understand that the earlier configuration was not good enough. By tracking parameters on the system for a while, I could view my configuration changes from a wider perspective and reach better conclusions.
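For readers unfamiliar with the three patterns in the clamd entry above: the first marks a clean reply, the second an infected one, and the third extracts the virus name. The following is a minimal Python sketch of that classification, not Amavisd's own code. The sample reply lines and the function name classify_reply are made up for illustration.

```python
import re

# The three patterns from the 'ClamAV-clamd' entry, translated from Perl qr// to Python.
CLEAN_RE = re.compile(r"\bOK$")
INFECTED_RE = re.compile(r"\bFOUND$")
VIRUS_NAME_RE = re.compile(r"^.*?: (?!Infected Archive)(.*) FOUND$")


def classify_reply(reply):
    """Classify one clamd reply line as clean, infected or unknown."""
    line = reply.rstrip("\r\n")
    if INFECTED_RE.search(line):
        match = VIRUS_NAME_RE.search(line)
        # The name is None when the name pattern deliberately skips the line.
        return ("infected", match.group(1) if match else None)
    if CLEAN_RE.search(line):
        return ("clean", None)
    return ("unknown", None)


if __name__ == "__main__":
    # Hypothetical reply lines, made up for illustration.
    for sample in (
        "/var/amavis/tmp/parts/p001: OK",
        "/var/amavis/tmp/parts/p002: Eicar-Test-Signature FOUND",
        "/var/amavis/tmp/parts/p003: Infected Archive FOUND",
    ):
        print(sample, "->", classify_reply(sample))
```

Note that the negative lookahead keeps "Infected Archive" from being reported as a virus name, while the line still counts as infected.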

The first biological portable computer

Tuesday, June 19th, 2007

This is not exactly a technical post, but I had to bring it online.

I am proud to be one of the first people, if not actually the first, to own a biological portable computer (BPC). You will find no other such thing, I think. I have searched Google, after all.

Although the docking station, or Biological Electronic Interface (BEI), looks quite similar to the IBM X40’s docking station, you can see the difference.

The docking station, or Biological Electronic Interface (BEI)

Unfortunately, in this picture you cannot clearly see the micro conductors which are used in the BEI plug, which is, actually, the method of connecting a simple and regular USB mouse to the BPC.

The BPC is self-supporting. It is self-propelled, and will walk(!!!) back to the BEI whenever the need arises. It has the computational power of hundreds of normal PCs, and although it runs its own unique OS, it has a simple interface which accepts commands. In the picture below, you can see the BPC in its docking station, charging.

The BPC inside its docking

As said, it accepts commands, but only seldom performs them. It’s a prototype, and still has a way to go. It has to fit the docking station better (this prototype BEI was developed as a case study), and should go through more modifications before it can be sold commercially. Still, very impressive.

RHEL3 Kickstart on Itanium (IA64)

Saturday, June 16th, 2007

Recently I installed several Redhat systems on IA64 platforms. Since this required only slight adjustments, and since there were two sets of systems, RHEL3 Update2 and RHEL4 Update3, I decided to use Kickstart for both, each with its own ks.cfg file.

For lack of any other explanation at the moment, I can only say I feel I have encountered a bug in ks handling in RHEL3 on the IA64 platform.

Steps:

1. Bring up a dedicated installation server. Install a DHCP server, a name server, a TFTP service (activated from xinetd), and an NFS service on it.

2. Set up DHCP for a dedicated network card. Address pool 192.168.0.x. Server IP: 192.168.0.1

3. Verify it’s working.

4. Extract RH images to the NFS root directory, under the distro name. Example – /install/rhel3.2-ia64

5. Add elilo PXE image for IA64 in /tftpboot. Add a file elilo.conf (elilo.conf)

6. Install both servers – RHEL3 and RHEL4

7. Take anaconda-ks.cfg and use it (with slight modifications) to fit my needs. Really minor changes.

8. Boot the next nodes based on these ks files. (RHEL3 ks file: ks.cfg)

While RHEL4 works fine and uses my ks.cfg, RHEL3 does not. It seems to start using it, and then goes on to ask me all these annoying questions (Welcome to RedHat 3 installation!)

I have even tried building ks.cfg using the redhat-config-kickstart tool, but with the same results.

Since the installation is done over a serial console, I cannot access other virtual consoles and debug the problem on-the-fly.

***UPDATE***

Per a suggestion in a forum, I have looked again into the elilo.conf file, and noticed that the ks path was different. Matter of paying attention. This is probably the problem, and I will verify it soon.
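A quick way to catch this kind of mismatch is to list the ks= argument of every boot entry side by side. The following is a rough Python sketch, assuming an elilo.conf in the usual image/label/append layout. The file contents below are hypothetical: they are not the elilo.conf used here, the kernel and initrd names are invented, and the rhel4.3-ia64 path is a guess modeled on the /install/rhel3.2-ia64 example from step 4.

```python
import re

# Matches "key=value" lines; comment lines and bare keywords are ignored.
LINE_RE = re.compile(r"^\s*([A-Za-z_-]+)\s*=\s*(.*)$")


def kickstart_sources(conf_text):
    """Return {label: ks argument or None} for each image section."""
    sections = []
    current = None
    for raw in conf_text.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip().strip('"')
        if key == "image":
            # A new boot entry starts; default its label to the image name.
            current = {"label": value, "ks": None}
            sections.append(current)
        elif current is not None and key == "label":
            current["label"] = value
        elif current is not None and key == "append":
            for token in value.split():
                if token.startswith("ks="):
                    current["ks"] = token[len("ks="):]
    return {section["label"]: section["ks"] for section in sections}


# Hypothetical elilo.conf, for illustration only.
SAMPLE_CONF = """prompt
timeout=50
default=rhel4

image=rhel3/vmlinuz
    label=rhel3
    initrd=rhel3/initrd.img
    append="console=ttyS0 ks=nfs:192.168.0.1:/install/rhel3.2-ia64/ks.cfg"

image=rhel4/vmlinuz
    label=rhel4
    initrd=rhel4/initrd.img
    append="console=ttyS0 ks=nfs:192.168.0.1:/install/rhel4.3-ia64/ks.cfg"
"""

if __name__ == "__main__":
    for label, ks in kickstart_sources(SAMPLE_CONF).items():
        print(label, "->", ks)
```

Each printed path can then be compared by eye against the directory actually exported over NFS.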

VMware Fencing in RedHat Cluster 5 (RHCS5)

Thursday, June 14th, 2007

Cluster fencing: contrary to common belief, high availability is not the highest priority of a high-availability cluster, but only the second. The highest priority of a high-availability cluster is maintaining data integrity by preventing multiple nodes from accessing the shared disk concurrently.

Different clusters, depending on the vendor, achieve this by different methods: by preventing access based on the status of the cluster (for example, Microsoft Cluster, which will not allow access to the disks without cluster management and coordination); by panicking the node in question (Oracle RAC, for example, or IBM HACMP); or by preventing failover unless the status of the other node, as well as all heartbeat links, was OK up to the exact moment of failure (VCS, for example).

Another method is based on a fence, or “Shoot the Other Node in the Head”. This “fence” is usually based on a hardware device which does not depend on the node’s OS, and is capable of shutting the node down, often brutally, upon request. A good fencing device can be a UPS which powers the other node. The whole idea is that in a case of uncertainty, either node can attempt to ‘kill’ the other, independently of any connectivity issue one of them might experience. The result of this race is quite obvious: one node remains alive, capable of taking over the resource groups, while the other node is off, unable to access the disk in an uncontrolled manner.

Linux-based clusters will not force you to use fencing of any sort. However, for production environments, setups without any fencing device are unsupported, as the cluster cannot handle cases of split-brain or uncertainty. The fencing hardware can be, as said before, a manageable UPS, a remote-controlled power switch, the server’s own IPMI (or any other independent management system such as HP iLO, IBM HMC, etc.), or even the fiber switch, as long as it can prevent the node in question from accessing the disks. These devices are quite expensive, but compared to hours of restore-from-backup, they surely justify their price.

On many sites there is a demand for a “test” setup which will be as similar to the production setup as possible. This test setup can be used to test upgrades, configuration changes, etc. Using fencing in this environment is important, for two reasons:

1. The production system’s behavior is simulated with a setup as similar as possible, and fencing plays an important part in the cluster and its logic.

2. A replicated production environment contains data which might have some importance. Even if it does not, re-replicating it from the production environment after uncontrolled access to the disk by a faulty node (and this test cluster is at higher risk, by definition of its role), or restoring it from tapes, is unpleasant and time consuming.

So we agree that the test cluster should have some sort of fencing device, even if not similar to production’s one, for the sake of the cluster logic.

On some sites, there is a demand for more than one test environment. Both setups, a single test environment and multiple test environments, can be defined to work as guests on a virtual server. Virtualization saves hardware (and power, and cooling) costs and allows for easy duplication and replication, so it is ideal for the task. That said, it brings up a problem: fencing a virtual server has implications, as we could kill all guest systems in one go. We wouldn’t want that to happen. Luckily for us, RedHat Cluster has a fencing device for VMware which, although not recommended in a production environment, will suffice for a test environment. These are the steps required to set up such a VMware fencing device in RHCS5:

1. Download the latest CVS fence_vmware from here. You can use this direct link (use with “save target as”). Save it in your /sbin directory under the name fence_vmware, and give it execution permissions.

2. Edit fence_vmware. In line 249 change the string “port” to “vmname”.

3. Install VMware Perl API on both cluster nodes. You will need to have gcc and openssl-devel installed on your system to be able to do so.

4. Change your fencing based on this example:

<?xml version="1.0"?>
<cluster alias="Gfs-test" config_version="39" name="Gfs-test">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="cent2" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="man2"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="cent1" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="man1"/>
                                </method>
                                <method name="2">
                                        <device domain="22 " name="11 "/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices>
                <fencedevice agent="fence_vmware" name="man2"
                          ipaddr="192.168.88.1" login="user" passwd="password"
                          vmname="c:\vmware\virt2\rhel5.vmx"/>
                <fencedevice agent="fence_vmware" name="man1"
                          ipaddr="192.168.88.1" login="user" passwd="password"
                          vmname="c:\vmware\virt1\rhel5.vmx"/>
        </fencedevices>
        <rm>
                <failoverdomains/>
                <resources>
                        <fs device="/dev/sda" force_fsck="0" force_unmount="0"
                                fsid="5" fstype="ext3" mountpoint="/data"
                                name="sda" options="" self_fence="0"/>
                </resources>
                <service autostart="1" name="smartd">
                        <ip address="192.168.88.201" monitor_link="1"/>
                </service>
                <service autostart="1" name="disk1">
                        <fs ref="sda"/>
                </service>
        </rm>
</cluster>

Replace the login and passwd values with your own VMware username and password.
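When editing cluster.conf by hand, it is easy to leave a fence method pointing at a device name that has no matching fencedevice entry. The following is a minimal Python sketch of such a cross-check, not a RedHat Cluster tool. It uses only the element and attribute names visible in the example above, and the embedded XML is a trimmed copy of that example with the passwords and the rm section left out.

```python
import xml.etree.ElementTree as ET


def undefined_fence_devices(cluster_conf_xml):
    """List (node, method, device) triples whose device has no <fencedevice>."""
    root = ET.fromstring(cluster_conf_xml)
    defined = {fd.get("name") for fd in root.iter("fencedevice")}
    problems = []
    for node in root.iter("clusternode"):
        for method in node.iter("method"):
            for device in method.iter("device"):
                if device.get("name") not in defined:
                    problems.append(
                        (node.get("name"), method.get("name"), device.get("name"))
                    )
    return problems


# Trimmed copy of the example above (rm section and credentials omitted).
SAMPLE = """<cluster alias="Gfs-test" config_version="39" name="Gfs-test">
  <clusternodes>
    <clusternode name="cent2" nodeid="1" votes="1">
      <fence>
        <method name="1"><device name="man2"/></method>
      </fence>
    </clusternode>
    <clusternode name="cent1" nodeid="2" votes="1">
      <fence>
        <method name="1"><device name="man1"/></method>
        <method name="2"><device domain="22 " name="11 "/></method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_vmware" name="man2" ipaddr="192.168.88.1"/>
    <fencedevice agent="fence_vmware" name="man1" ipaddr="192.168.88.1"/>
  </fencedevices>
</cluster>"""

if __name__ == "__main__":
    for node, method, device in undefined_fence_devices(SAMPLE):
        print("node %s, method %s: no fencedevice named %r" % (node, method, device))
```

Run against the example, this reports the second method of cent1, whose device name has no fencedevice entry. Whether that leftover is intentional cannot be told from the file alone.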

If you have a Centos system, you will be required to perform these three steps:

1. ln -s /usr/sbin/cman_tool /sbin/cman_tool

2. cp /etc/redhat-release /etc/redhat-release.orig

3. echo "Red Hat Enterprise Linux Server release 5 (Tikanga)" > /etc/redhat-release

This should do the trick. Good luck, and thanks again to Yoni, who brought the configuration steps and fought through them.

***UPDATE***

Per the comments (and, a bit late, common logic), I have broken the long lines in the XML quote of cluster.conf. In case these line breaks break something in RedHat Cluster, I have added the original XML file here: cluster.conf

More on the Nabaztag/tag

Wednesday, June 13th, 2007

Actually, this post has turned out more technical than non-technical. However, for the sake of the cute little Nabaztag (you can send me messages too! Go here and send a message to “fatutchi”!), I am keeping it in this category as well.

Today is a busy day, so I’ll have several posts.

This one will deal with the Nabaztag/tag. I have extended the PHP form/script offered in my previous post to allow selecting multiple Nabaztags. I have also added reading the status of the ears, and parsing the XML returned by the Nabaztag API site.

This is an ugly script, but it works. As said before – if you see fit to extend it or add features, please do so. Attached here: nabiV2.php.txt

I have noticed Violet has had several issues with their site. I must confess that I expected more from it. Having been involved as a consultant in several large-scale setups which sustained several tens of thousands (and more) of connections per second, I know that the main performance hog is usually an inefficient application design. It could be that Violet’s problems simply point at low-quality server-side software. Pity. I hope it will get better.

I have a Nabaztag/tag

Tuesday, June 12th, 2007

I have received my Nabaztag/tag just a day ago, and it is a cute little thingie.

At-ten-tion!

What can it do?

Actually, not much. It is a wireless device (client) which accesses Violet’s Nabaztag servers to get its commands. In theory, you cannot hijack the session and drive it directly over the LAN; you must go through the Internet. This leads to delays in delivering commands.

It can move its ears (surprisingly, very quietly), and it can play sound, either by text-to-voice (which probably happens on the server side) or by streaming MP3 (it cannot, as far as I’ve noticed, play MMS directly). It can also report the positions of its ears.

I can fly!

As you probably know, the more important thing is not about what it can do, but about how we can utilize it. Violet has added a list of RSS sources for the Nabaztag to read aloud. Through server-side sub processes, it can tell the time (usually at full hours), it can act as a wakeup clock (doesn’t do its job for me – not enough to wake me up), etc. It can probably take part in games based on the location of the ears (for example, if you agree, move the right ear down, etc).

You can check wikipedia for its entry. They cover most of it, maybe except for how cute it is, and it is.

If it were to end at this, I would have been quite frustrated, especially with the device’s price. However, Violet has exported an API which allows me (and you, and him, and everyone!) to send commands (unlimited by the number, as it seems) to the Nabaztag – Say this, move your ears to this position, etc. It allows me to send a choreography, aka a dance, to the device, and it will perform it based on the timing set by the sender.

I spent part of yesterday writing a simple (and quite ugly, if you ask me) form which uses the API for simple commands. My TODO list includes implementing the whole choreography thing and making it easy. I would like to build this interface as a base for other possible uses, such as community games, etc.

I have uploaded my PHP form, which uses the API, here. It is free to use (of course), and everyone is encouraged to use it and/or modify it, as long as you give me credit :-) . I’m not sure about its security yet… nabi.php.txt

It’s a raw thing, but it works. Don’t forget to:

1. Activate your API interface in http://my.nabaztag.com

2. Change the parameters of your SN and your TOKEN

3. Place the script on a PHP-enabled site.
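To give an idea of what such a form does behind the scenes, here is a rough Python sketch that only builds the request URL and sends nothing. It is not the PHP script linked above. The endpoint in the example is a placeholder, the lowercase query names sn and token are assumed from the SN and TOKEN parameters mentioned in step 2, and tts is an assumed name for a text-to-speech command. Check Violet's API documentation for the real endpoint and parameter names.

```python
from urllib.parse import urlencode


def build_command_url(api_url, serial, token, **command):
    """Build a GET URL carrying the rabbit's credentials plus one command.

    api_url: the API endpoint (placeholder in the example below).
    serial, token: the SN and TOKEN from step 2 above.
    command: extra query parameters; their names are assumptions.
    """
    params = {"sn": serial, "token": token}
    params.update(command)
    return api_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder endpoint and made-up credentials, for illustration only.
    url = build_command_url(
        "http://api.example.invalid/api.jsp",
        "0013D3000000",
        "1234567890",
        tts="Hello there",
    )
    print(url)
```

Using urlencode rather than string concatenation keeps spaces and special characters in the message from breaking the query string.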

Enjoy!

Linux LVM performance measurement

Sunday, June 10th, 2007

Modern Linux LVM offers great abilities to maintain snapshots of existing logical volumes. Unlike NetApp “Write Anywhere File Layout” (WAFL), Linux LVM uses “Copy-on-Write” (COW) to allow snapshots. The process is described, in general terms, in this pdf document.

I have run several small tests, just to get a real-life estimate of the actual performance impact this COW method can cause.

Server details:

1. CPU: 2x Xeon 2.8GHz

2. Disks: /dev/sda – system disk. Did not touch it; /dev/sdb – used for the LVM; /dev/sdc – used for the LVM

3. Mount: LV is mounted (and remains mounted) on /vmware

Results:

1. No snapshot, Using VG on /dev/sdb only:

# time dd if=/dev/zero of=/vmware/test.2GB bs=1M count=2048
2048+0 records in
2048+0 records out

real 0m16.088s
user 0m0.009s
sys 0m8.756s

2. With snapshot on the same disk (/dev/sdb):

# time dd if=/dev/zero of=/vmware/test.2GB bs=1M count=2048
2048+0 records in
2048+0 records out

real 6m5.185s
user 0m0.008s
sys 0m11.754s

3. With snapshot on 2nd disk (/dev/sdc):

# time dd if=/dev/zero of=/vmware/test.2GB bs=1M count=2048
2048+0 records in
2048+0 records out

real 5m17.604s
user 0m0.004s
sys 0m11.265s

4. Same as before, creating a new empty file on the disk:

# time dd if=/dev/zero of=/vmware/test2.2GB bs=1M count=2048
2048+0 records in
2048+0 records out

real 3m24.804s
user 0m0.006s
sys 0m11.907s

5. Removed the snapshot. Created a 3rd file:
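For the four timed runs shown above (1 to 4), the wall-clock times can be turned into throughput figures, which makes the comparison easier to read. The following is a small Python sketch that uses only the "real" lines quoted above and the 2048 MiB written by each dd. The run labels and function names are mine. It works out to roughly 127 MiB/s without a snapshot, against about 5.6 to 10 MiB/s with one.

```python
import re

SIZE_MIB = 2048  # bs=1M count=2048 in every run above

# The "real" lines of runs 1-4, copied from the results above.
RUNS = [
    ("1. no snapshot", "real 0m16.088s"),
    ("2. snapshot on the same disk", "real 6m5.185s"),
    ("3. snapshot on the 2nd disk", "real 5m17.604s"),
    ("4. snapshot on the 2nd disk, new file", "real 3m24.804s"),
]


def real_seconds(time_line):
    """Convert a 'real XmY.YYYs' line printed by time into seconds."""
    match = re.fullmatch(r"real\s+(\d+)m([\d.]+)s", time_line.strip())
    if match is None:
        raise ValueError("not a 'real' timing line: %r" % time_line)
    return int(match.group(1)) * 60 + float(match.group(2))


def mib_per_second(time_line, size_mib=SIZE_MIB):
    """Average write throughput for one dd run."""
    return size_mib / real_seconds(time_line)


if __name__ == "__main__":
    baseline = real_seconds(RUNS[0][1])
    for label, line in RUNS:
        seconds = real_seconds(line)
        print("%-40s %7.1f MiB/s  %5.1fx slower than run 1"
              % (label, mib_per_second(line), seconds / baseline))
```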

net-snmp broken in RHEL (and Centos, of course) – diskio

Saturday, June 9th, 2007

For quite a while I believed that Linux, unlike other types of systems, was unable to produce any I/O SNMP information. I only recently found out that this was only partially true: all production-level distros, such as RedHat (and Centos, for that matter), were unable to produce any output for SNMP DISKIO queries.

I found a bugzilla entry about it, so I am throwing down the gauntlet to the maintainers of any RH-compatible repository: please recompile (and maintain, of course) an alternative net-snmp package which supports diskio.

Meanwhile, I have found this blog post, which offers an alternate (and quite clumsy, yet working) solution to the disk performance measurement issue in Linux. I haven’t tried it yet, but I will, rather soon.
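The script from that post is not reproduced here. As a separate, rough illustration of where per-disk counters can come from on a Linux 2.6 system, the sketch below parses text in the /proc/diskstats format. The sample lines are made up, and this is not a claim about how the linked script works.

```python
def parse_diskstats(text):
    """Parse /proc/diskstats-style text into {device: counters}.

    Whole-disk lines carry major, minor, name and then at least eleven
    counters; shorter lines (partitions on older 2.6 kernels) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue
        counters = [int(value) for value in fields[3:14]]
        stats[fields[2]] = {
            "reads": counters[0],            # reads completed
            "sectors_read": counters[2],     # sectors read
            "writes": counters[4],           # writes completed
            "sectors_written": counters[6],  # sectors written
        }
    return stats


# Made-up sample in the /proc/diskstats layout, for illustration only.
SAMPLE = """   8    0 sda 4526 1320 187234 21340 8712 15322 192272 104532 0 30112 125872
   8    1 sda1 120 960 40 320
"""

if __name__ == "__main__":
    for name, counters in parse_diskstats(SAMPLE).items():
        print(name, counters)
    # On a real system: parse_diskstats(open("/proc/diskstats").read())
```

Since these are cumulative counters, a monitoring tool would sample them twice and graph the difference.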

***UPDATE***

I have used the script from the blog post mentioned above, and it works.

Speed could be an issue. Comparing two servers, the speed difference was amazing.

Both servers are connected to the same switch as the server running the query. Server1 has a P2 233MHz CPU, while Server2 has dual 2.8GHz Xeon CPUs.

~$ time snmpwalk -c COMMUNITY -v2c Server1 1.3.6.1.4.1.2021.13.15 > /dev/null

real 0m0.311s
user 0m0.024s
sys 0m0.020s

~$ time snmpwalk -c COMMUNITY -v2c Server2 1.3.6.1.4.1.2021.13.15 > /dev/null

real 0m8.303s
user 0m0.044s
sys 0m0.012s

Looks like a huge difference. However, I believe it’s currently good enough for me.