Archive for July, 2005

Finished customer’s project

Sunday, July 31st, 2005

It was long, it was tiresome, and it was nasty. We went to a hosting farm at one of Israel’s largest ISPs, where their (and our) customer needed to relocate servers and change the servers’ IPs, settings, etc.

I don’t know why, but we tried to come as prepared as possible. One of the things you learn doing such projects in an uncontrolled environment, far away from your own personal lab, is this – "Trust no one". Just like X-Files, but for real.

If it’s not obvious, here’s an example – suppose you get there and find out you need some drivers for one of the machines. In a controlled environment, you would get these drivers from the Internet, but in an uncontrolled environment, you must bring them with you in advance, and make sure the CD drive, floppy drive, USB port, or whatever is being used there actually functions and is in good condition. Not only that: you must either arrive with a whole pack of methods for getting the files/info/drivers/data into the machine in question, or with a way of transferring between media types, like CD -> Disk on Key, or DoK -> Floppy.

So, trying to be as prepared as possible for the machine transfer and change (plus an extra ~400 domains), we came with the following inventory:

  • 1 IBM 1U server, preinstalled with Linux, preconfigured as a DNS server and as a web server saying "The server is under maintenance. It will be solved soon" or something like that.
  • 2 Laptops running Linux/Windows, including backups of all the configurations of the virtual servers and the root servers.
  • Cables
    (We discovered only at the last minute that we would get nothing from the hosting farm and had to bring it all with us. It was night, and we just grabbed whatever we could, hoping it would do. It did).
  • Tools
  • extras
  • Exact written procedure of which files to change, where, and into what. New IPs pre-assigned, passwords, etc.

We were only half prepared. Half prepared, because the one thing we didn’t predict was the ill-tempered and lazy SoB who was our contact in the farm. I have no idea why, and I do not care why, but he had some grudge against our (and his!) customer, and he did everything he could to "not help us". Meaning he didn’t deliberately hinder us, but he did the least he could to help, down to nothing.

Example? Sure. We needed a network link for the new rack, and he said we had one. I asked him to activate it, and soon he claimed he had. Not long after, when I had reconfigured the router and moved it to the new location, I needed to connect it to this link. Not working. I started debugging the problem (maybe a bad cable, maybe the interface was in "shut" mode, maybe we needed a laplink cable. Don’t know). Soon I had the obvious idea, and asked him if the link was up. He said "No. I was just waiting for you". I asked him to bring the link up, keeping my temper in check as best I could. It took him 15-30 minutes, while we just stood and waited (it was a show stopper. You can’t start moving servers before you know you have somewhere to connect them, right?). Finally, after lots of intervention on our side (like testing and seeing the link was still down, changing cables, etc.), the link was brought up, and we could continue.

Things like this piss me off. You expect the man to do anything and everything he can to assist, so all of us can go home already (the job started at midnight). This lazy SoB was supposed to hand us the cable link, with everything predefined per our demands, and wait for us to finish. Not to start setting it up during our work while "waiting" for us. We had to wait for him, that’s for sure, but he had no reason to wait for us.

So that’s a hostile and uncontrolled environment.

Don’t get me wrong. We had tons of laughs, and enjoyed the job (and the A/C), but the lack of cooperation and the stinking attitude of our contact person were, to say the least, a problem. Another example: when we asked for coffee (to remind you – midnight, no coffee shops open for kilometers around us), he showed us into their "kitchen", and pointed out how nice he was being, because of the special time and all, since otherwise we would not be allowed to use this "kitchen". Man, this is only a cup of coffee, and it’s not yours, nor your mom’s! Stinking attitude.

And we had our share of technical difficulties. The person who set up our client’s servers was, how to say, an amateur. He hard-coded each machine’s IP address in around a dozen different locations. Three times in the firewall settings (for each IP, whether of a virtual or a real machine), twice in each network configuration file (per machine), and once for every major service each machine (again, virtual or real) was running, such as the sshd listen address, the FTPD listen address, the httpd listen address, etc. It was a major hell. The hosted domains’ zone files did not use a CNAME record pointing at a single name whose IP address is defined only once (each Vserver had only one IP), but had a full A record with the whole IP address. We had to "sed" them all to the new addresses, decrease the TTLs for each domain (again, "sed", or a friend), and so on.
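To give an idea of that kind of bulk edit, here is a minimal Python sketch. It is not the actual commands used that night. It swaps an old IP address for a new one and lowers the default TTL in the text of a BIND-style zone file. The function name, the example addresses, and the assumption that the TTL is set by a single "$TTL" directive are all made up for the illustration.

```python
import re


def rewrite_zone(zone_text, old_ip, new_ip, new_ttl=300):
    """Return zone_text with old_ip replaced by new_ip and $TTL lowered.

    Hypothetical helper: assumes a BIND-style zone file whose default
    TTL is set with a "$TTL <value>" directive at the start of a line.
    """
    # Match the old address only as a whole IPv4 literal, so that
    # 192.0.2.10 does not also match inside 192.0.2.100.
    ip_pattern = re.compile(r"(?<![\d.])" + re.escape(old_ip) + r"(?![\d.])")
    text = ip_pattern.sub(new_ip, zone_text)
    # Lower the default TTL so resolvers pick up the change sooner.
    text = re.sub(r"(?m)^\$TTL\s+\S+", "$TTL %d" % new_ttl, text)
    return text


# Made-up zone fragment using documentation-only address ranges.
EXAMPLE_ZONE = "$TTL 86400\nwww IN A 192.0.2.10\nftp IN A 192.0.2.100\n"
```

Calling rewrite_zone(EXAMPLE_ZONE, "192.0.2.10", "198.51.100.7") changes only the "www" record and the TTL line. A real zone would also need its SOA serial number increased before reloading, which this sketch does not do.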

It wasn’t easy, but all in all it went rather well. Why did we do it? For the money, of course. And besides, the hosting farm had better A/C than I have :-)

Well, that sums up a night without sleep, filled with work, after which I started traveling around, doing all kinds of chores I had accumulated around this area of Israel. It went quite well, after all, and I managed to keep my eyes open when driving, which was good, generally speaking.

So, here I am, back home, about to go to sleep, with a very, very long day behind me.

*Addition*

I managed to take pictures at the place. They are attached as thumbnails. Sorry for the choppy quality, as they were taken using a cell phone camera, and not a real camera.

Front of the rack

Front of the rack, #2

The rear of the rack

The rack was a bit shorter than we expected, so our power cables have to be pressed in to allow closing the doors. Tomorrow night, we are to add a router to the system and change the firewall’s settings accordingly. Will be fun. Not.

Totally MRTG

Monday, July 25th, 2005

I’ve played with MRTG a bit further. 

Well, I have long-lasting and well-used MRTG configurations in various locations and on different servers. Just a few days ago I decided to add monitoring of Apache (httpd) to my MRTG graphs. So I enabled the server-status page for a limited set of addresses, and added the MRTG stuff. I had to tinker with the Perl script a bit, since it returned the same value twice for almost every query, which meant lots of wasted screen space. I decided to merge some of them, and with luck, it will prove useful.
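For anyone wanting a starting point, here is a minimal Python sketch of the idea. It is not the re-edited Perl script mentioned above. It assumes the machine-readable server-status output ("Key: value" lines, as returned by the "?auto" form of the status page) and the four-line format MRTG expects from an external script: first value, second value, uptime, target name. The field names in the sample (BusyWorkers, IdleWorkers) follow Apache 2.x and may differ on other versions; the function names and the sample text are made up.

```python
def parse_status(auto_text):
    """Parse 'Key: value' lines of server-status output into a dict."""
    fields = {}
    for line in auto_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def mrtg_lines(fields, first_key, second_key, name="apache"):
    """Build the four lines MRTG reads from an external script.

    Using two different keys puts two different values in one graph,
    instead of the same value drawn twice.
    """
    return "\n".join([
        fields.get(first_key, "0"),
        fields.get(second_key, "0"),
        fields.get("Uptime", "unknown"),
        name,
    ])


# Made-up sample of the status text, for illustration only.
SAMPLE_STATUS = (
    "Total Accesses: 1200\n"
    "Total kBytes: 3400\n"
    "Uptime: 86400\n"
    "BusyWorkers: 3\n"
    "IdleWorkers: 7\n"
)
```

Printing mrtg_lines(parse_status(SAMPLE_STATUS), "BusyWorkers", "IdleWorkers") gives busy and idle workers as the two graph values.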

I will not publish the MRTG settings or the re-edited scripts here for the time being, not until I test them further. If you ask for them, I will be happy to give them away, but not online now, and no screenshots either. I don’t have the patience to take screenshots and erase the real information from them, so I won’t do it now. Sorry.

Customer’s project

Friday, July 22nd, 2005

I’m on the brink of the actual, physical execution of a project for a customer: moving a bunch of Linux servers at their hosting provider to another physical location, changing the IP addresses, and making sure everything works correctly.

It could have been a piece of cake, but this machine runs vservers, and it uses some management interface, etc., which demands careful setup. Not only that, but this machine, acting as a hosting server, has DNS A records for each and every virtual host instead of CNAME records, which means we’ll have fun.

I’ve just talked to the hosting supplier (the owner of the farm itself), and they are not thrilled. If it were me, the whole transfer setup would have been cut at that point, and I would have moved to another computer farm. It was a lousy service, and it should be paid accordingly.

So it won’t happen this weekend. The farm is not pre-ready, as one person there said (and he is now abroad, so there’s no one to prove I’m right about that). It will have to be next week. Damn. I was hoping to go on a short diving cruise next week.

Well, it’s just me ranting. At least a friend’s supposed to come over, have an (excellent) hummus with me, and help me plan the transfer. Now I can get some sleep (look at the time!!! So early in the morning!)

AMR Modem, failure so far

Thursday, July 21st, 2005

As my quest for a fully working laptop advances, I have decided to invest my time (or got drawn into it, dunno why) in making the modem work. Actually, I’m stuck here. I cannot see the modem’s device in lspci, and I have no isapnp devices whatsoever, so I cannot claim it’s an ISA device. And the AMR is supposed to depend on the PCI bus, am I right?

The modem is supposed to be a Lucent AMR modem. According to linmodems, this modem is not supported whatsoever, and never will be. According to some other parts of the internet, this modem might be supported via the Smartlink module, which I was able to compile. Well, as we sometimes learn, compiling isn’t everything. The module failed to load because it claimed to have some unresolved symbols. Great. Searching Google, I found a web site dealing with activating the modem on ALSA-based systems (sounds just like my case!), so I tried that direction as well. It was well documented within the driver’s README itself. I discovered one catch, though: "make install" rebuilds the whole code. It’s like "make all" with the install part afterwards, so if you’re into it, make sure you read the damn README file, and type "make install SUPPORT_ALSA=1" right from the start. No luck, though, as there is no extra ALSA device, not after this procedure, and not after I rebuilt the modules per the instructions in the README file.

Just now I’ve tested an alternate method, as described in this website. It didn’t work either, but was another try, right?

Still, no PCI ID for the modem. I might be searching in the wrong direction. Maybe it’s the ISA bridge, and I’ll have to pinpoint it there. I can’t tell now. I will get back to it sometime later.

Fujitsu special keys

Wednesday, July 20th, 2005

Last night, after I finished writing the previous entry, I was looking for Fujitsu & Linux related sites, for somewhat more information about my modem.

I found a site which reminded me of the Fujitsu special keys, which I had never used nor managed to set up. Well, the version the site pointed at was old, and I could not compile it; however, the newer version (courtesy of Google) of fjkeys and apanel allowed me, finally and for the first time, to compile and insert this module. I was so surprised to find out that there was one LED which I never knew about.

Nothing is perfect, of course. I have tons of messages such as this:

i2c_adapter i2c-0: Error: command never completed

but besides that (which neither damages nor slows down my suspend/resume), all seems to work. Good. This blog’s existence (no readers so far, and still no search engine index) pushed me towards solving some of my problems, and towards a better laptop-wise life.

Another set of reasons are that it’s damn hot in the house (but in the balcony it’s cooler, and there’s some wind), and that I have a new set of two batteries (Extended primary battery, and the modular bay battery) which allow me up to 8-9 hours of work. Under high loads, this time tends to get shorter, but not less than 5 hours or so. Cool.

Software Suspend 2, a success story

Wednesday, July 20th, 2005

Owning a laptop, you try to get the best out of it. You want it to be the strongest it can, you want it to be fast, reliable, useful, and cunning. My Fujitsu is cunning, I can say. Not a day passes without me hearing someone saying "Wow, it is so small". I didn’t get it for the audience, though. I got it because traveling to end users and customer sites could be frustrating when you have no connectivity, and you cannot rely on the customer’s environment. A few years ago I once had the paradox I call "The Ez paradox": I got to a customer who had just arranged a brand new ADSL line, I had a network card with me, but I had forgotten the drivers. There was no way I could connect using her computer, nor did I have my laptop then. It took a few extra hours just to find someone around who would let me use his computer to download the NIC drivers. Never again, I swore then, and I was proven right. There was one time with another customer where, using my laptop, I proved the screen to be the problem, and not the computer’s VGA card, and another where I could remove the blame from a poor and unjustly accused computer, because I could not connect to the internet using the same ADSL line from my own laptop either, etc. It’s a useful machine, and it saves me time.

One of the things one expects is to open his laptop’s cover, and "Whoosh!", get an up-and-running system right then. As we all know computers, this is not the case. Running Linux on the laptop, it was even worse. Up till a year ago, Software Suspend solutions for laptops with pooched ACPI and no built-in hardware suspend were, how to say, poor at best. I remember being able to suspend using some SWSUSP beta version. It managed to suspend and resume about 2/3 of the times I tried, it took five minutes, and when it failed to suspend I was not always attentive enough to notice. The poor laptop remained up and running inside my bag a few times (and when swsusp hung, it consumed 100% CPU), until I discovered it half an hour or so later. Not good. Not only that, but this beta version worked sometimes, while the release (or RC) version that followed it failed completely to suspend, or resume, or both. It was far from perfect. However, I am full of respect for the people who made it, fighting uncommon hardware setups which you can find only in laptops, and made a mature and working product.

Mature, because it works, not because it’s the most trivial thing to make work.

I can clearly say that SWSUSP2, in version 2.1.9, for kernel 2.6.11.11, is remarkably fast, clear, and well working. Using the new UserUI (a nice splash screen with animation, shown during the suspend/resume operation), I can be the envy of my peers, if only they cared.

For anyone on the net asking about it, I can briefly describe what I did to make it work.

First, follow the instructions on the Software Suspend web site. They know what they are saying, although their site is not always organized (it’s getting better, with the Wiki! Keep up the good work!).

To explain in a few words what I have done:

  1. Get and untar/bzip2 kernel version 2.6.11.11 (linux-2.6.11.11)
  2. Get software-suspend-2.1.9-for-2.6.11
  3. Patch the new kernel using the new method swsusp2 supplies (RTFM)
  4. Patch the new kernel with your newly downloaded fbsplash-0.9.2-2.6.11 patch
  5. Get, unzip and compile the userui package (note – you will need to edit fbanim/userui_fbanim_core.c and change #include <linux/fb.h> into #include "linux/fb.h", else it won’t compile)
  6. You would want to make your own kernel now. It’s going to take a while. You can borrow from my own config file (config-2.6.11.11.txt).
  7. Anything further can be found easily in the Software Suspend Wiki
  8. Make sure your initrd (if you use it) is set up correctly
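The one-line source change from step 5 can also be done without opening an editor. Below is a minimal Python sketch of it; the helper name is made up, and it only works on the text of the file, so reading and writing fbanim/userui_fbanim_core.c is left to the caller.

```python
def fix_fb_include(source):
    """Switch the fb.h include from angle brackets to double quotes.

    Hypothetical helper for the edit described in step 5; any other
    include lines are left untouched.
    """
    return source.replace('#include <linux/fb.h>', '#include "linux/fb.h"')


# Made-up fragment standing in for the real file contents.
BEFORE = '#include <stdio.h>\n#include <linux/fb.h>\n'
AFTER = fix_fb_include(BEFORE)
```

Running the helper a second time changes nothing, so it is safe to apply to a file that has already been edited.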

It should work correctly at this point. You might like to pick some nice graphics, and I hope, with the help of my wife, to arrange myself some unique and eye-catching image. Why? Because. It doesn’t cost me anything more, and it will surely attract attention to the system.

Orinoco Solution (so far)

Tuesday, July 19th, 2005

Well, my Orinoco 802.11b mini-PCI card had this weird thing – sometimes it failed to start up after suspend. Why? I don’t know. I wasn’t able to understand the cause of it, and not being a coder, I had a hard time tracking down kernel driver problems.

The error message was the following one:

orinoco_lock() called with hw_unavailable (dev=c1256800)

eth1: Error -110 setting multicast list.

Google is my friend. It can be yours as well.

I searched for Orinoco_pci modules, and was able to come up with this:

http://www.nongnu.org/orinoco/

OK, so I understood that my driver version is 0.13e, which comes with the kernel, but this version didn’t seem to do the work quite right, so I had to test a newer version, way newer. I downloaded the external module pack from here (direct link to the tar.gz I’ve downloaded) and tried to compile it. I failed. Badly. I was not able to understand the cause of the failure. A more experienced coder would have solved it by now, but I was not able to. I got this, in general:

orinoco_pci.c:330: error: too many arguments to function `pci_save_state'

orinoco_pci.c:347: error: too many arguments to function `pci_restore_state'

I presumed it was right – the function got one argument too many – so I looked for other examples of these functions by running grep for "pci_save_state" and "pci_restore_state" under /usr/src/linux

I got a few files, all using only one argument, so I had to change the file orinoco_pci.c in two places:

pci_save_state(pdev, card->pci_state); on line 330

pci_restore_state(pdev, card->pci_state); on line 347

Based on the examples seen in the kernel tree itself, I’ve changed these two lines to the following:

pci_save_state(pdev);

pci_restore_state(pdev);
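For anyone who prefers to script the change, here is a minimal Python sketch of the same edit. It is just an illustration, not how the change above was actually made (that was done by hand). The pattern and function names are made up, and the regular expression assumes simple calls like the two shown, with no nested parentheses in the arguments.

```python
import re

# Matches a call such as pci_save_state(pdev, card->pci_state)
# and captures the function name and its first argument.
TWO_ARG_CALL = re.compile(
    r"\b(pci_(?:save|restore)_state)\(\s*([^,()]+?)\s*,[^()]*\)"
)


def drop_second_argument(source):
    """Rewrite two-argument pci_save/restore_state calls to one argument.

    Calls that already have a single argument contain no comma, so the
    pattern does not match them and they are left as they are.
    """
    return TWO_ARG_CALL.sub(r"\1(\2)", source)


# The two original lines from orinoco_pci.c quoted above.
OLD_LINES = (
    "pci_save_state(pdev, card->pci_state);\n"
    "pci_restore_state(pdev, card->pci_state);\n"
)
NEW_LINES = drop_second_argument(OLD_LINES)
```

As with the manual edit, keep a backup of the original file before writing the result back.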

I was surprised it managed to compile correctly (tons of warnings, but no errors). I was able to manually load the module for testing, and after careful backup of the original modules, I replaced them with the newer ones.

Tested since yesterday, and so far it seems like my other wireless card, the PCMCIA card, has become a waste of money. I guess I can still use it to scan for networks…

Since there was no place in hell where I was able to find this piece of information (me not being a coder), I am putting it online for anyone who ever needs it.

Cheers.

Ez

Laptop Fujitsu P2120 Summary

Tuesday, July 19th, 2005

Well, I have had a laptop for the last 2 years or so, which is notable for its very light weight and rich setup (it comes fully equipped – DVD-CDR combo, 802.11b, 1xPCMCIA, 2xUSB2.0, 1xFireWire, S-Video, VGA Out, LAN, Modem, etc.), at only 3.2 lbs, or ~1.5Kg. However, it has its disadvantages – the weak CPU (Transmeta 933MHz, which performs worse than a P3 500MHz), the (very) slow HDD (40GB, but slow), and the very small LCD – 10.2". Moreover, its Linux compatibility is not sky high, so I had to work hard to make things work correctly.

So far I have a working setup where the following list works:

1) ACPI Events*

2) PCMCIA

3) Software Suspend 2 (Aka, Hibernation, or ACPI State 4)*

4) USB, and USB devices

5) CDR/DVD

6) Floppy (as a USB Storage device)

7) VGA, and X finally stopped crashing by itself.

8) LAN – RTL8139 based, using ifplugd for detection of links, and using DHCP to obtain IP address then

9) Wireless Card, Orinoco_pci module*

The devices marked with (*) are those which I have problems with. Outside of this list, my FireWire is untested, and I cannot activate my Modem (LT, if I’m not mistaken).

Current problems:

ACPId fails to show in the logs (/var/log/acpid) the transition from AC to battery mode, and vice versa. When on AC, it says nothing. When on battery, it prints a notice about using battery mode, or that the battery is being discharged, once every 10 minutes or so. However, in /proc/acpi/battery/CMB*/state I can see when the battery is being charged and discharged. Weird.
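Until acpid behaves, the state files can be polled directly. Here is a minimal Python sketch of that, under the assumption that each state file consists of "key: value" lines and includes a "charging state" line, as these files usually do. The function names and the sample text are made up for the illustration.

```python
import glob


def parse_battery_state(text):
    """Parse the 'key: value' lines of a battery state file into a dict."""
    state = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines without a colon
            state[key.strip()] = value.strip()
    return state


def charging_states(pattern="/proc/acpi/battery/CMB*/state"):
    """Return {path: charging state} for every battery matching pattern."""
    result = {}
    for path in glob.glob(pattern):
        with open(path) as handle:
            fields = parse_battery_state(handle.read())
        result[path] = fields.get("charging state", "unknown")
    return result


# Made-up sample of a state file, for illustration only.
SAMPLE_STATE = (
    "present:                 yes\n"
    "capacity state:          ok\n"
    "charging state:          discharging\n"
    "remaining capacity:      30000 mWh\n"
)
```

Calling charging_states() from a loop or a cron job and logging whenever the value changes would give the AC/battery transitions that the acpid log is missing.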

Software Suspend 2 has just been patched into a 2.6.11.11 kernel (I was not able to boot 2.6.12.2 at the moment, but I will devote time to it in the near future), and is being tested as we speak. So far, it has worked remarkably well; combined with Fbsplash, it shows a nice animation during boot time. One thing, though, which is related to the Orinoco_pci module – if the Orinoco driver has no wireless network accessible, it will hang during the power-restore session, fill my logs with the message "hermes @ MEM 0xcfc74000: Error -16 issuing command." and will not be able to function, in most cases until I hibernate/resume the computer again. Even then, it will not always work. It drove me nuts, as it abused my CPU, and thus my battery when running on battery, and it required that I find some alternate solution. Finally I have a method: adding a PCMCIA wireless card, using NDISwrapper to activate it, and removing the orinoco-related modules.

Linux – Debian which was based on Knoppix, and was vastly tweaked since. Unstable, and up-to-date. My own custom kernel, and up until now, using vanilla kernel with the swsusp2 patch. Nothing fancy.

First message

Tuesday, July 19th, 2005

Well, it’s been a while since I first considered playing with a blog of my own. I never quite found the convincing reason which would pull me out of my chair, and out of not-a-single-thing-doing-for-a-whole-afternoon-while-browsing-the-net, into the active part of installing my own blog.

Well, I did just ten minutes ago.

Why? Because during my tech adventures (as much as they might seem adventurous to anyone), I get to complete tasks, or do things, which I have nowhere to record or update, and thus I don’t get to keep them, neither for myself for later reference, nor for others who might bump into the same problems I have.

Who am I?

Keeping this blog blogish enough, there is no point in mentioning my name. You can call me Ez-Aton, or Ez, for short, if you feel like calling me at all. It’s not that I hide, but there is a point in a person’s life where he wants to use the little anonymity the web offers. The little of it which still exists nowadays, anyhow.

Well, I act as a Unix SysAdmin, Linux hobbyist and SysAdmin, Windows SysAdmin, with some experience with Mac, etc. I manage a few dozen *nix machines @ work, namely Linux, Solaris, HP-UX, AIX, and some Windows, in a rather complex environment. I don’t claim to be an expert on it, oh no, but I claim to know something about everything. And Google being my friend, I can manage my way around the more common obstacles. I learn fast, I almost never make the same mistake twice (unless it’s on purpose, to gain something), so I manage, uneducated (no degree, no official courses), to get a very complex set of systems up and running. Not perfect, but how many people do you know who can get such things (as you might find in my blog at other times) up and running?

Oh, and I live in Israel, which is a small country, but one can get used to it.

So I hope you find the info you need here, or at least enjoy wasting a few more minutes of your life, where time flies in front of the computer, browsing and searching, and doing only little. Yep, techie’s life.

Cheers!

Ez.