Archive for May, 2006

HP ML110 G3 and Linux Centos 4.3 / RHEL 4 Update 3

Tuesday, May 30th, 2006

Using the same installation server as before (my laptop), I was able to install Centos 4.3 on my new HP ML110 G3, with the addition of HP’s drivers for the Adaptec SATA RAID controller.

Using the same method as when I installed Centos 4.3 on the IBM x306, but with HP’s drivers, the job was easy.

As a reminder, here is the process of preparing the setup:

(A note – When I say "replace it", I always recommend you keep the old file aside for a rainy day)

1. Obtain the floppy image of the drivers, and put it somewhere accessible, such as some easily accessible NFS share.

2. Obtain the PXE image of the kernel of Centos4.1 or RHEL 4 Update 1, and replace your PXE kernel with it (downgrade it)

3. Keep the driver RPM and the Centos 4.1 / RHEL 4 Update 1 kernel RPM handy on your NFS share.

4. Replace the PXE initrd.img file in the same way as the kernel in step 2.

5. Obtain the /Centos/base/stage2.img file from Centos 4.1 or RHEL 4 Update 1 (depends on the installation distribution, of course), and replace your existing one with it.

6. I assume your installation media is actually NFS, so your boot command should be something like: linux dd=nfs:NAME_OF_SERVER:/path/to/NFS/Directory

This should work like a charm. Note that you need to use the 64bit kernel with the 64bit driver, and likewise for 32bit. It won’t work otherwise, of course.

After you’ve finished the installation, *before the reboot*, press Ctrl+Alt+F2 to switch to a text console, and do the following:
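The "keep the older one aside" habit from the note above can be scripted. This is only a minimal sketch for a POSIX shell. The function name and the paths in the comments are hypothetical and are not part of the original setup:

```shell
#!/bin/sh
# replace_keeping_backup NEW CURRENT
# Copies NEW over CURRENT, first moving CURRENT to CURRENT.orig.
# An existing .orig is never overwritten, so the very first version survives.
replace_keeping_backup() {
    new=$1
    current=$2
    [ -r "$new" ] || { echo "cannot read $new" >&2; return 1; }
    if [ -e "$current" ] && [ ! -e "$current.orig" ]; then
        mv "$current" "$current.orig" || return 1
    fi
    cp "$new" "$current"
}

# Hypothetical usage on the install server (adjust to your own layout):
#   replace_keeping_backup /srv/centos41/vmlinuz    /tftpboot/vmlinuz
#   replace_keeping_backup /srv/centos41/initrd.img /tftpboot/initrd.img
#   replace_keeping_backup /srv/centos41/stage2.img /srv/centos43/Centos/base/stage2.img
```

Because the first backup is never overwritten, running the helper twice still leaves the original Centos 4.3 files available for the rainy day.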

1. Copy your kernel RPM to the new system /root directory: cp /mnt/source/prepared_dir/kernel….rpm /mnt/sysimage/root/

2. Do the same for HP drivers RPM

3. Chroot into the new system: chroot /mnt/sysimage

4. Install (with --force if required, but *never* try it first) the RPMs you’ve put in /root. First the kernel and then the HP driver.

5. The HP driver RPM will fail in its post-install script. That’s OK. Rename the existing /boot/initrd-2.6.9-11.ELsmp.img (or the non-SMP one, depending on your installed kernel) to keep it aside.

6. Verify you have an alias for the new storage device in your /etc/modprobe.conf

7. Run mkinitrd /boot/initrd-2.6.9-11.ELsmp.img 2.6.9-11.ELsmp (or non-SMP, depending on your kernel)

8. Manually edit /etc/grub.conf to suit your needs.
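The post-install steps above can be collected into one small script. This is a sketch under assumptions, not a tested installer. The RPM file names are passed in by you (the post does not spell them out), the function names are made up, and the `scsi_hostadapter` alias name is the usual one on this generation of distributions but should be checked against your own modprobe.conf. It prints the commands by default. Run it with DRY_RUN=0 to execute them:

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# post_install KERNEL_RPM DRIVER_RPM [KERNEL_VERSION]
# KERNEL_RPM and DRIVER_RPM are the file names prepared on the NFS share.
post_install() {
    kernel_rpm=$1
    driver_rpm=$2
    kver=${3:-2.6.9-11.ELsmp}
    src=/mnt/source/prepared_dir
    root=/mnt/sysimage

    # Steps 1-2: copy both RPMs into the new system
    run cp "$src/$kernel_rpm" "$src/$driver_rpm" "$root/root/"
    # Steps 3-4: install inside the new system, kernel first
    run chroot "$root" rpm -ivh "/root/$kernel_rpm"
    # Step 5: the driver's post-install script is expected to fail
    run chroot "$root" rpm -ivh "/root/$driver_rpm" || true
    run chroot "$root" mv "/boot/initrd-$kver.img" "/boot/initrd-$kver.img.old"
    # Step 6: show the storage alias (assumed name: scsi_hostadapter)
    run chroot "$root" grep scsi_hostadapter /etc/modprobe.conf
    # Step 7: rebuild the initrd so it includes the driver
    run chroot "$root" mkinitrd "/boot/initrd-$kver.img" "$kver"
}

# Hypothetical file names, shown only to illustrate the call:
post_install kernel.rpm hp-driver.rpm
```

Step 8, editing /etc/grub.conf, is deliberately left manual.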

Note – I do not like Grub. Actually, I find it lacking in many ways, so I install Lilo from the i386 (not the 64bit, since it’s not there) version of the distro. Later on, you can rename /etc/lilo.conf.anaconda to /etc/lilo.conf, and work with it. Don’t forget to run /sbin/lilo after changes to this file.

Hard Freeze when Using FireFox, Unison-GTK, and some other GTK apps

Friday, May 26th, 2006

Hard freezes are unpleasant at best. They also prevent you from tracking the source of the problem. You speculate, based on the "family" of applications you encounter problems with, and try to obtain some resolution.

At first it was FireFox. I removed it and installed Galeon instead. Didn’t help. I left it at that, and started using Konqueror, which worked correctly.

Things worked correctly for a while, until one day, when I opened Unison-GTK to sync my folders with my desktop, I got another hard freeze.

Based on a few assumptions, I started searching for a solution or some workaround. I encountered this link, which led me to believe the problem is GTK+ 1.0 related. This link, however, is rather old, and seems irrelevant to the cause, as this started only a few weeks ago.

A better-defined search led me to this Ubuntu bug description, which suggested I go back to 16bit colors (which I had used until a month or so ago).

At first glance, the problem is solved. However, this is no more than a workaround, as I am currently limited to 16bit colors, and because I’m stuck with a Radeon Mobility M6 LY, which is the main cause of all this. I hope to dump these buggy cards on my next laptop.

My current xserver-xorg-video-ati package version is 6.5.8.0-1 (debian).

Linux IPTables flow

Friday, May 26th, 2006

IPTables can be tricky. The concept of chains pointing to chains pointing to chains can get complicated.

However, it is important to understand the initial flow, the initial "which chain points where", and the general concept, which later makes NAT or DNAT easier, or even just knowing where to put a single rule. This is especially true if you are going to use your Linux box as a router. Even if not, it helps to know how to defend it.

So, here’s an image describing the common relationship between the predefined chains in Linux IPTables.

IPTables default chains relationship
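As a text companion to the diagram, the order in which a packet meets the built-in chains can be written down as a tiny lookup. This sketch lists only the chain names in the standard netfilter order and ignores which table (mangle, nat, filter) is consulted at each point. The function name is made up for illustration:

```shell
#!/bin/sh
# chains_for KIND
# KIND is one of:
#   incoming  - packet from the network addressed to this machine
#   forwarded - packet from the network addressed to some other machine
#   outgoing  - packet generated by a local process
chains_for() {
    case $1 in
        incoming)  echo "PREROUTING INPUT" ;;
        forwarded) echo "PREROUTING FORWARD POSTROUTING" ;;
        outgoing)  echo "OUTPUT POSTROUTING" ;;
        *) echo "usage: chains_for incoming|forwarded|outgoing" >&2; return 1 ;;
    esac
}

for kind in incoming forwarded outgoing; do
    printf '%-10s %s\n' "$kind:" "$(chains_for "$kind")"
done
```

The routing decision sits between PREROUTING and INPUT/FORWARD, which is why DNAT rules belong in PREROUTING (before the decision) and SNAT or masquerading rules belong in POSTROUTING (after it).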

Web server behind a web server

Friday, May 26th, 2006

I’ve acquired a new server which is to supply services to a certain group. In most cases, I would have used the PREROUTING chain in iptables on my router, with a rule such as this:

iptables -t nat -I PREROUTING -i <external_Interface_name> -p tcp -s <Some_IP_address> --dport 80 -j DNAT --to-destination <New_server_internal_IP>:80

I can do this trick for any other port just as well. However, I already have one web server inside my network, and I cannot know the source IP of my special visitors. Tough luck.

Reverting to a more application-based solution, I can use my existing Apache server, which already listens on port 80 and already receives these requests, with mod_proxy directives and name-based virtual hosts.

Assuming the name of the server should be domain.com, and that the DNS entries are correct, I would add a block such as this to my vhosts.conf (or whatever other file contains your Apache2 virtual servers configuration):

<VirtualHost *:80>
ServerName domain.com
ErrorLog logs/domain.com-error_log
CustomLog logs/domain.com-access_log common
ProxyRequests Off
<Proxy *>
Order deny,allow
Allow from all
</Proxy>

ProxyPass / http://<Internal_Server_IP_or_Name>/
ProxyPassReverse / http://<Internal_Server_IP_or_Name>/
</VirtualHost>

I’m not absolutely sure about the need for logs, but I was able to see a few issues by using them, such as the internal server being down. I can see that the internal server is being accessed, and that it’s working just fine.

A note – If it’s the first Name Based Virtual Host you’ve added, you will need to “readjust” your entire configuration to a Name Based Virtual Host. Name agnostic and Name based cannot reside on the same IP configuration. It just won’t work.
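A quick way to check that the name-based virtual host really selects the proxied server is to send a request with an explicit Host header straight to the front-end Apache. This is a sketch, not something from the original setup. The address 192.0.2.10 is a documentation placeholder for the front-end server and the function names are made up. It prints the curl command by default. Set DRY_RUN=0 to run it:

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# check_vhost FRONT_END HOSTNAME
# Asks FRONT_END for "/" while presenting HOSTNAME in the Host header,
# and prints only the HTTP status code of the answer.
check_vhost() {
    run curl -s -o /dev/null -w '%{http_code}\n' -H "Host: $2" "http://$1/"
}

# 192.0.2.10 stands in for the existing (front-end) Apache server.
check_vhost 192.0.2.10 domain.com
```

A 200 here, together with a matching line in logs/domain.com-access_log, suggests the request went through the proxy. A 502 or 503 typically means the virtual host matched but the internal server could not be reached.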

Transparently Routing / Proxying Information

Monday, May 15th, 2006

I was required to set up a transparent proxy. The general idea was to follow a diagram like the one here:

The company did not want any information (http, https, ftp, whatever) to pass directly through the firewall from the internal network to the external network. If we can move it all via some sort of proxy, the general idea says, the added security is well worth it.

Getting an initial configuration to work is rather simple. For port 80 (HTTP), one need do no more than install Squid with the transparent directives included (they can be found here, for example, and on dozens of other web sites), and make sure the router redirects all outbound HTTP traffic to the Linux proxy.

It worked like a charm. A few minor tweaks, and caching worked well.

It didn’t work when it came to other protocols. It appears Squid cannot transparently redirect SSL requests (I did not expect it to actually cache the information). The whole idea of SSL is to prevent the possibility of a "Man-in-the-Middle" attack, so Squid cannot be part of the point-to-point communication, unless directed to do so by the browser, with the CONNECT command. This command is issued only if the client is aware that there is a proxy on the way, i.e. is configured to use it, which contradicts the whole idea of a transparent proxy.

When it failed, I came up with the next idea – let the Linux machine route the forwarded packets onwards, by acting as a self-sustained NAT server. If it can translate all requests as coming from it, I will be able to redirect all traffic through it. It did not work. After working hard on the IPTables chains, and adding logging (iptables -t nat -I PREROUTING -j LOG --log-prefix "PRERouting: "), I discovered that although the PREROUTING chain accepted the packets, they never reached the FORWARD or POSTROUTING chains…

The general conclusion was that the packets were destined for the Linux machine itself. The Firewall/Router had redirected all packets to the Linux server not by altering the routing table to point at the Linux server as the next hop, but by altering the destination of the packets themselves. It meant that all redirected packets were addressed to the Linux machine, so they were delivered locally instead of being forwarded.
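The tracing described above can be extended to every chain a packet might visit. Only the PREROUTING rule comes from the post. The other three rules and their log prefixes are my assumption of how one would confirm the conclusion. The script prints the commands by default. Set DRY_RUN=0 (as root) to insert the rules:

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

trace_rules() {
    # From the post: log everything entering nat/PREROUTING
    run iptables -t nat -I PREROUTING -j LOG --log-prefix "PRERouting: "
    # Assumed additions: one LOG rule per chain the packet could reach next
    run iptables -I INPUT -j LOG --log-prefix "Input: "
    run iptables -I FORWARD -j LOG --log-prefix "Forward: "
    run iptables -t nat -I POSTROUTING -j LOG --log-prefix "POSTRouting: "
}

trace_rules
```

A packet that shows up with the "PRERouting: " and "Input: " prefixes but never with "Forward: " was addressed to the box itself. Keep in mind that the nat table only sees the first packet of each connection, so the nat prefixes appear once per connection rather than once per packet.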

Why did HTTP succeed in passing the transparent proxy? Because an HTTP request carries the target name (the web address) in its data, in the Host header, and not only in the IP headers of the packet. This is what allows for "Name based shared hosting", and it is also why the transparent proxy can actually exist.
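To make this concrete, here is a small sketch of what a transparent proxy can recover from a request even after the packet’s destination address has been rewritten. The function name is made up, and real proxies parse requests far more carefully:

```shell
#!/bin/sh
# host_of_request
# Reads a raw HTTP request on stdin and prints the value of its Host header.
host_of_request() {
    tr -d '\r' | sed -n 's/^[Hh][Oo][Ss][Tt]:[ ]*//p' | head -n 1
}

# A minimal HTTP/1.1 request, as a browser would send it.
printf 'GET /index.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n' | host_of_request
# prints: www.example.com
```

An SSL connection offers nothing comparable in clear text at this point, which is why the same trick does not carry over.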

There is no such luck with other protocols, I’m afraid.

The solution in this case can be achieved via a few methods:

1. Use a non-transparent proxy. Set the clients to use it via some script, which will enable them to avoid using it when outside the company. Combined with a transparent HTTP proxy, it can block unwanted access.

2. Use stateful inspection on any allowed outbound packets, except HTTP, which will be redirected to the proxy server transparently.

3. Set the Linux machine in the direct path outside, as an additional line of defence.

4. If the firewall/Router is capable of it, set up protocol-based routing. If you only route outbound packets for some port differently, you do not rewrite the packet destination.
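Option 4 can be sketched for the case where the firewall/router is itself a Linux box, which the post does not say, so treat this purely as an illustration. The interface name, the addresses, mark value 1 and table number 100 are all arbitrary assumptions. The script prints the commands by default. Set DRY_RUN=0 (as root) to apply them:

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# policy_route_http LAN_IF PROXY_IP
policy_route_http() {
    lan_if=$1
    proxy_ip=$2
    # Mark outbound HTTP from the LAN, except what the proxy itself sends out.
    run iptables -t mangle -A PREROUTING -i "$lan_if" ! -s "$proxy_ip" -p tcp --dport 80 -j MARK --set-mark 1
    # Marked packets consult routing table 100 ...
    run ip rule add fwmark 1 table 100
    # ... whose only route uses the proxy as the next hop.
    run ip route add default via "$proxy_ip" dev "$lan_if" table 100
}

# eth1 and 192.0.2.20 are placeholders for the LAN interface and the proxy.
policy_route_http eth1 192.0.2.20
```

The packets arrive at the proxy with their original destination intact, so the proxy box then needs its own local redirect into Squid.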

I tend to choose option 1, as it allows access to work silently when using HTTP, and prevents unconfigured clients from accessing disallowed ports. Such a set of rules could look something like this (the proxy listens on port 80):

1. From *IN* to *LINUX* outbound to port 80, ALLOW

2. From *IN* to *INTERNET* outbound to port 80 REDIRECT to Linux:80

3. From *IN* to *INTERNET* DENY

Clients with defined proxy settings will work just fine. Clients with undefined proxy settings will not be able to access HTTPS, FTP, etc., but will still be able to browse the regular web.
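The three rules above could be expressed in iptables if the firewall were a Linux box. This is a sketch under that assumption, and it also assumes the proxy is reached through a different interface than the internal network. The interface name and address are placeholders. The commands are printed by default. Set DRY_RUN=0 (as root) to apply them:

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# proxy_ruleset IN_IF PROXY_IP
proxy_ruleset() {
    in_if=$1
    proxy_ip=$2
    # Rule 2: IN -> INTERNET on port 80 gets its destination rewritten to Linux:80.
    run iptables -t nat -A PREROUTING -i "$in_if" -p tcp --dport 80 ! -d "$proxy_ip" -j DNAT --to-destination "$proxy_ip:80"
    # Rule 1: IN -> LINUX on port 80 is allowed (this also matches rewritten packets).
    run iptables -A FORWARD -i "$in_if" -d "$proxy_ip" -p tcp --dport 80 -j ACCEPT
    # Rule 3: anything else from IN is denied.
    run iptables -A FORWARD -i "$in_if" -j DROP
}

# eth1 and 192.0.2.20 are placeholders for the internal interface and the proxy.
proxy_ruleset eth1 192.0.2.20
```

The DNAT rule is listed first only because PREROUTING is traversed before FORWARD. Within the FORWARD chain the order matters: the ACCEPT has to come before the DROP.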

In all these cases, control over the allowed URLs and destinations is in the hands of the local IT team.

IBM X306 (ServerRaid7e) and Linux Centos 4.3

Sunday, May 14th, 2006

It was no fun, and I hope I will never experience such a bad setup again.

Summary: IBM X306. The X306, X206 and some of the others are equipped with the ServerRaid7e SATA controller. These controllers lack Linux drivers, and thus make me a sad person. Drivers are available from IBM’s web site for RHEL 3 Update 4 (similar to Centos 3.4), and on a very rare occasion, for RHEL 4 and RHEL 4 Update 1.

Centos 4.3 is equivalent to RHEL 4 Update 3, so it wasn’t quite that.

Stage zero: Come ready.

I came as ready as one can be. Equipped with a laptop capable of serving bootp requests, and NFS images of both the 32bit and the 64bit version of Centos 4.3, I was as ready as I could be. I discovered that tftpd alone (Debian version) was not enough (or could not boot IBM’s PXE, in this case), and had to replace it with tftpd-hpa, which worked correctly.

Stage one: Assess the problem. The problem was assessed, and I understood that no disks were available for installation. The kernel was unable to access local disks. Bad.

Stage two: Find a solution.

With Internet access, I was able to identify drivers on IBM’s site, but they were for RHEL3 only. Some forum (can’t remember the link, sorry) led me to a hidden part of IBM’s web site, which included a driver for RHEL 4 Update 1. I downloaded it.

Stage Three: Work harder.

To load a vendor module built for a specific kernel version, you need to run that specific version. I had Centos 4.3, and required the kernel of Centos 4.1. I visited the old Centos download source, and was able to download the kernel package for Centos 4.1 (kernel-2.6.9-11.EL.x86_64.rpm), and the required kernel and initrd image for installation.
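The version requirement can be checked before rebooting into a kernel that cannot load its storage driver. A binary module records the kernel it was built for in its "vermagic" string, whose first word is the kernel release. The sketch below only compares strings. The function name is made up, and the sample vermagic value is illustrative rather than copied from IBM’s driver:

```shell
#!/bin/sh
# module_fits KERNEL_VERSION VERMAGIC
# Succeeds when the first word of VERMAGIC equals KERNEL_VERSION.
module_fits() {
    [ "${2%% *}" = "$1" ]
}

# On a real system the two values would come from:
#   uname -r                                 (or the version of the kernel RPM)
#   modinfo -F vermagic /path/to/driver.ko
if module_fits "2.6.9-11.EL" "2.6.9-11.EL gcc-3.4"; then
    echo "driver matches kernel"
else
    echo "driver was built for another kernel"
fi
```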

I’ve started installation with the following command:

linux dd=nfs:192.168.0.1:/mnt/Source/40k8690.img

After a short boot sequence, an error message appeared, claiming I did not install the correct version. Of course I did not, I’ve used an older kernel and initrd!

I’ve downloaded the old stage2.img file from cdroot/Centos/base, and replaced the current one.

There was a problem with IBM’s supplied drivers, or the one obtained from the net, so I’ve looked into IBM’s CDROM, the one supplied with the server, and found the driver there. I will add it here, just in case – both RPM and DriverDisk (dd) image.

Finally I managed to install the server, after a rather long time in the server room.

One note – do yourselves a favour and replace the kernel package with the one from Centos 4.1 before you reboot the server at the end of the installation. It can save you both a rescue install and trouble with RPM.

For some reason, I could not install the alternate (older) kernel. It failed to install because it was older, and failed with an error when used with "--force". I had to insert the files manually. It required some tweaking:

I used "rpm2cpio kernel….rpm | cpio -id" to extract the files, and then moved them to the respective directories. I had to do a similar trick for IBM’s drivers, because their post-install script failed for some reason. They did create an entry for the raid controller in /etc/modprobe.conf, so I only had to recreate the initrd file. A command similar to this:

mv /boot/initrd-2.6.9-11.EL.img /boot/initrd-2.6.9-11.EL.img.old

mkinitrd /boot/initrd-2.6.9-11.EL.img 2.6.9-11.EL

did the trick. Adding the correct entry to /etc/grub.conf fixed it all up. I was able to boot the newly installed system.
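For reference, a grub.conf entry for the older kernel could be generated along these lines. This is a sketch with loud assumptions: it presumes a separate /boot partition on the first partition of the first disk, which is why the paths start at / and the root line is (hd0,0), and a root filesystem labelled "/". Neither detail is stated in the post, and the function name is made up:

```shell
#!/bin/sh
# grub_stanza KERNEL_VERSION ROOT_DEVICE
# Prints a GRUB (legacy) menu entry on stdout; nothing is written to disk.
grub_stanza() {
    kver=$1
    rootdev=$2
    cat <<EOF
title CentOS ($kver)
        root (hd0,0)
        kernel /vmlinuz-$kver ro root=$rootdev
        initrd /initrd-$kver.img
EOF
}

# Assumed values: adjust the version and the root device to your system.
grub_stanza 2.6.9-11.EL LABEL=/
```

Compare the output with the entry the installer wrote for the 4.3 kernel and copy its root and kernel options rather than trusting the assumed ones.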

I don’t know what the implications are, but I did not dare let Kudzu change these settings when it claimed to have found a raid controller. I just ordered it to ignore it and never ask again.

reason: 550 Requested action not taken: Nonstandard SMTP line terminator.

Tuesday, May 9th, 2006

I have encountered this problem on my own personal mail server only once in a while. It’s rather rare, and happens only when sending mail to some specific domains.

My first notion was based on the claim that if it works for anywhere except for these one or two domains, the problem is with these one or two.

My second notion was to investigate it somewhat further, to be able to assist the owner of the domains in question in solving their problem, or supply them with links showing possible solutions.

I was very surprised to discover that the problem was here. It was a bug with spamass-milter, used by my server to connect ol’ Sendmail and Spamassassin together, and it appeared only in version 0.3.0.

I’ve just upgraded to version 0.3.1, and it works correctly for these domains.

Case closed.
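The error text itself refers to lines that do not end in CR LF, which is the line terminator SMTP requires. The post does not say how the milter altered the message, so the following is only a diagnostic sketch, assuming you have a raw copy of a message as it went out on the wire. The function name is made up:

```shell
#!/bin/sh
# bare_lf_lines FILE
# Prints how many lines in FILE end in LF without a preceding CR.
bare_lf_lines() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    cr=$(printf '\r')
    # Count the lines that do NOT end with a carriage return.
    grep -c -v "${cr}\$" "$1"
    return 0
}

# Example with a temporary file: one of the four lines ends in a bare LF.
tmp=$(mktemp)
printf 'Subject: test\r\n\r\nbad line\nlast line\r\n' > "$tmp"
bare_lf_lines "$tmp"
rm -f "$tmp"
```

A non-zero count on an outgoing message would be consistent with a strict receiving server answering with this 550.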

Ontap Simulator, and some insights about NetApp

Tuesday, May 9th, 2006

First and foremost – the Ontap simulator, a great tool which surely can assist in learning the NetApp interface and its use, lacks in performance. It has some built-in limitations – no FCP, no disks (virtual disks) larger than 1GB (per my trial-and-error. I might find out I was wrong somehow, and put it on this website), and low performance. I got about 300KB/s transfer rate both on iSCSI and on NFS. To make sure it was not due to some network hog hiding somewhere on my net(s), I even tried it from the host of the simulator itself, but to no avail. Low performance. Don’t try to use it as your own home iSCSI Target. Better just use Linux for this purpose, with the drivers obtained from here (It’s one of my next steps into “shared storage(s) for all”).

Another issue – After much reading through NetApp documentation, I’ve reached the following concepts of the product. Please correct me if you see fit:

The older method was to create a volume (vol create) directly from disks. Either using raid_dp or raid4.

The current method is to create aggregates (aggr create) from disks. Each aggregate consists of raid groups. A raid group (rg) can be made up of up to eight physical disks. Each raid group has one or two parity disks, depending on the type of raid (raid4 uses one parity disk, and raid_dp uses “double parity”, as its name suggests).
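The arithmetic this implies is easy to sketch. Taking the figures from the paragraph above at face value (groups of up to eight disks, one parity disk for raid4, two for raid_dp), the number of disks left for data can be estimated as follows. The function name is made up, and treating a leftover too small to form a group as spares is my simplification:

```shell
#!/bin/sh
# data_disks TOTAL_DISKS RG_SIZE PARITY_PER_GROUP
# Prints how many disks hold data once each raid group has paid its parity.
data_disks() {
    total=$1
    rg_size=$2
    parity=$3
    full=$((total / rg_size))          # completely filled raid groups
    rest=$((total % rg_size))          # disks left for a last, smaller group
    data=$((full * (rg_size - parity)))
    # A partial group is only useful if it has more disks than parity needs.
    if [ "$rest" -gt "$parity" ]; then
        data=$((data + rest - parity))
    fi
    echo "$data"
}

data_disks 16 8 2   # raid_dp: two groups of 8, 6 data disks each -> 12
data_disks 16 8 1   # raid4:   two groups of 8, 7 data disks each -> 14
```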

Actually, I can assume that each aggregate is formatted using the WAFL filesystem, which leads to the conclusion that modern (flex) volumes are logical “chunks” of this whole WAFL layout. In the past, each volume was a separate WAFL-formatted unit, and each size change required adding disks.

This separation of the flex volume from the aggregate suggests to me the possibility of a multiple-root capable WAFL. It can explain the lack of requirement for contiguous space on the aggregate. This eases the space management, and allows for fast and easy “cloning” of volumes.

I believe that the new “clone” method is based on the WAFL built-in snapshot capabilities. Although WAFL snapshots are supposed to be space-conservative, they require guaranteed space on the aggregate prior to committing the clone itself. If the aggregate is too crowded, they will fail with the error message “not enough space”. If there is enough for snapshots, but not enough to guarantee a full clone, you’ll get a message saying “space not guaranteed”.

I see the flex volumes as some combination between filesystem (WAFL) and LVM, living together on the same level.

LUNs on NetApp: iSCSI and/or Fibre LUNs are actually managed as a single (per-LUN) large file contained within a volume. This file has special permissions (I was not able to copy it or modify it while it was online and I had root permissions. However, I am rather new to NetApp technology), and it is being exported as a disk outside. Much like an ISO image (which is a large file containing a whole filesystem layout) these files contain a whole disk layout, including partition tables, LVM headers, etc – just like a real disk.

Thinking about it, it’s neither impossible nor very surprising. A disk is no more than a container of data, of blocks, and if you can utilize the required communication protocol used for accessing it and managing its blocks (aka, the transport layer on which filesystem can access the block data), you can, with just a little translation interface, set up a virtual disk which will behave just like any regular disk.

This brings us to the advantages of NetApp’s WAFL – the ability to minimize I/O while maintaining a set of snapshots for the system – a list of per-block modification history. It means you can “snapshot” your LUN, being physically no more than a file on a WAFL-based volume, and you can go back with your data to a previous date – an hour, a day, a week. Time travel for your data.

There are, unfortunately, some major side effects. If you’ve read the WAFL description from NetApp, my summary will be inaccurate at best. If you haven’t, it will be enough, but still you are most encouraged to read it. The idea is that this filesystem is made out of multi-layers of pointers, and of blocks. A pointer can point to more than one block. When you commit a snapshot, you do not change the pointers, you do not move data, you just modify the set of pointers. When there is any change in the data (meaning a block is changed), the pointer points to the alternate block instead of the previous (historical) block, but keeps reference of the older block’s location. This way, only modified blocks are actually recreated, while any unmodified data remains on the same spot on the physical disk. An additional claim of NetApp is that their WAFL is optimized for the raid4 and raid_dp they use, and utilizes it in a smart manner.
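The pointer idea can be played with in a toy model. The sketch below is not WAFL’s real layout. It only mimics the behaviour described above, using a directory in which data blocks are written once and never changed, and a "map" file holds the pointers. All names are invented for the illustration:

```shell
#!/bin/sh
# Toy copy-on-write volume kept in a directory:
#   blocks/N     physical blocks, written once and never changed
#   active.map   "logical_block physical_block" pairs for the live volume
#   NAME.map     frozen copy of those pairs, i.e. a snapshot

vol_init() {
    mkdir -p "$1/blocks" && : > "$1/active.map" && echo 0 > "$1/next"
}

# vol_write DIR LOGICAL_BLOCK DATA
vol_write() {
    id=$(cat "$1/next")
    printf '%s\n' "$3" > "$1/blocks/$id"     # data always lands in a fresh block
    echo $((id + 1)) > "$1/next"
    # Repoint the logical block; the old physical block is left untouched.
    awk -v n="$2" -v id="$id" '$1 != n { print } END { print n, id }' \
        "$1/active.map" > "$1/active.tmp" && mv "$1/active.tmp" "$1/active.map"
}

# vol_snapshot DIR NAME: copies pointers only, never data
vol_snapshot() {
    cp "$1/active.map" "$1/$2.map"
}

# vol_read DIR MAP LOGICAL_BLOCK   (MAP is "active" or a snapshot name)
vol_read() {
    id=$(awk -v n="$3" '$1 == n { print $2 }' "$1/$2.map")
    cat "$1/blocks/$id"
}

d=$(mktemp -d)
vol_init "$d"
vol_write "$d" 0 "monday"
vol_snapshot "$d" hourly
vol_write "$d" 0 "tuesday"
vol_read "$d" active 0    # prints: tuesday
vol_read "$d" hourly 0    # prints: monday
rm -rf "$d"
```

Taking the snapshot cost one small copy of the map, and the overwrite cost one new block. The price is that logical block 0 now lives in a different physical place than before.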

The problem with WAFL, as can be easily seen, is fragmentation. For CIFS and NFS, it does not cause much of a problem, as the system is very capable of read-ahead just to solve this issue. However, a LUN (which is supposed to act as a contiguous layout, just like any hard-drive or raid-array in the world, and on which various file-system related operations occur) gets fragmented.

Unlike CIFS or NFS, LUN read-ahead is harder to predict, as the client tries to do just the same. Unlike real disks, NetApp LUNs do not behave, performance-wise, like the hard-drive layout any DB or FS has learned to expect and was optimized for. For example, a DB with lots of small changes will try to commit them in large write operations, flushed at regular intervals, and will strive to place them as close to each other and as contiguously as possible. On a NetApp LUN this will cause fragmentation, and will result in lower write (and later read) performance.

That’s all for today.

NetApp Ontap 7.1

Thursday, May 4th, 2006

I’ve had the pleasure of playing with Ontap Simulator. This is a marvelous tool, designed to simulate a real NetApp appliance, in an easy and affordable manner.

I’ve noticed a link to this simulator, in Oracle’s web site. I’m posting it here so you’ll get to know what I’m talking about. I’m playing with it on a Linux host. I’ve created virtual disks (small ones only. The simulator does not allow larger than 1G disks anyhow), and I’m playing with NFS, CIFS, Snapshots, etc.

If I have some surprising views on the matter, I will share them here.