Archive for April, 2006

Cacti monitoring of Postfix and Amavisd-new

Sunday, April 30th, 2006

It’s been a while since I last dived into Cacti. Back then I got frustrated by its inability to monitor Postfix mail transport and Amavisd filtering. I wanted this both for the practical results (knowing when the mail load is excessive, knowing when a spam attack is under way…) and for the show-off value.

I wasn’t very happy then, but today I managed to get such monitoring working, thanks to this link. Note that I’m linking to the 2nd page. Use the first page to acquire the required Cacti template, and use the modified script on page 2 to monitor both Amavisd and Postfix.

A note – use the script from the 2nd page even if you don’t use Amavisd. If you do use it, note the changes required in the /etc/amavisd.conf file (in my case, at least). Remove or comment out the directive "$log_recip_templ = undef;" if it exists, so that logging is correct. Also note that you need to send the amavisd logs to the same log file Postfix uses – in my case /var/log/maillog.

Further on this subject – I had been sending amavisd logs to a separate logging facility: "$SYSLOG_LEVEL = 'local6.debug';", so I could catch the log directly from local6. In my /etc/syslog.conf I have the following directive, so that these logs end up in the right file:

mail.*;local6.* -/var/log/maillog

I’ve separated certain log functions using the "localX" logging facilities to better control the log data collected on a loaded production server.
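
To make the structure of that syslog.conf directive explicit, here is a minimal Python sketch that splits such a line into its selectors and its target file. It assumes the classic sysklogd syntax, where a leading "-" on the file name means the file is not synced after every write. The function names parse_rule and facility_matches are hypothetical, and the sketch only handles the simple "facility.priority" form used above (no comma lists, "none", "=" or "!" modifiers).

```python
from typing import List, Tuple

Selector = Tuple[str, str]  # (facility, priority), e.g. ("local6", "*")


def parse_rule(line: str) -> Tuple[List[Selector], str, bool]:
    """Split a simple syslog.conf rule into (selectors, target file, no_sync)."""
    selector_field, action = line.split(None, 1)
    selectors = [
        (sel.split(".", 1)[0], sel.split(".", 1)[1])
        for sel in selector_field.split(";")
    ]
    action = action.strip()
    # Classic sysklogd: a leading "-" means "do not sync after each write".
    no_sync = action.startswith("-")
    return selectors, action.lstrip("-"), no_sync


def facility_matches(selectors: List[Selector], facility: str) -> bool:
    """True if any selector names this facility (or the "*" wildcard)."""
    return any(fac in ("*", facility) for fac, _priority in selectors)


if __name__ == "__main__":
    selectors, target, no_sync = parse_rule("mail.*;local6.* -/var/log/maillog")
    print(selectors, target, no_sync)
    print("local6 goes to maillog:", facility_matches(selectors, "local6"))
    print("cron goes to maillog:", facility_matches(selectors, "cron"))
```

Read this way, the rule says that everything from the mail facility and everything from local6 lands in the same file, which is what the Cacti script needs.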

Power Consumption on a rack closet

Sunday, April 30th, 2006

We’ve had some issues with power going down on specific rack closets. We have a single 16A fuse for each closet. Some closets, the ones containing older Sun servers, fit only about 12 servers each, so we had no issues with power consumption there. However, in a rack closet holding many 1U and 2U servers (mainly PC servers, for example HP DL360G4 (1U), HP DL380G4 (2U), Dell PE2850 (2U), etc.), we reached about 17 servers with the closet 3/4 full. Oh, and the fuse tripped. As a rule of thumb, each PC server draws about 1.2A on average, meaning that on a 16A fuse we can put about 12 PC servers, with some safety margin. We do wish to avoid exceeding 14A, for better stability, for the ability to start the whole closet up in a single pass, and so we won’t be testing the fuse 24/7.
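
The arithmetic behind this rule of thumb can be sketched in a few lines of Python. The figures (16A fuse, 14A target, 1.2A per server) come from the paragraph above; the function names are made up for illustration, and the model is only a steady-state average that ignores start-up inrush current.

```python
import math

FUSE_RATING_A = 16.0     # one fuse per closet (from the post)
TARGET_MAX_A = 14.0      # ceiling we would like to stay under (from the post)
AMPS_PER_SERVER = 1.2    # rule-of-thumb average draw per PC server (from the post)


def total_draw(servers: int, amps_per_server: float = AMPS_PER_SERVER) -> float:
    """Estimated steady-state current for a closet holding `servers` machines."""
    return servers * amps_per_server


def max_servers(limit_a: float, amps_per_server: float = AMPS_PER_SERVER) -> int:
    """Whole number of servers whose estimated draw fits under limit_a."""
    return math.floor(limit_a / amps_per_server)


if __name__ == "__main__":
    for count in (12, 17):
        print(f"{count} servers: {total_draw(count):.1f} A")
    print("max under the fuse rating:", max_servers(FUSE_RATING_A))
    print("max under the 14 A target:", max_servers(TARGET_MAX_A))
```

By this estimate 17 servers come to roughly 20.4A, well over the fuse rating, while 12 servers come to roughly 14.4A, which sits just above the 14A target. Staying strictly under 14A would mean 11 servers.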

A note – if you’re mixing power sources inside your rack closet, label them, so you’ll know which goes where. It makes for better stability.

Fibre Cloud – Multi SAN Switch Configuration

Tuesday, April 18th, 2006

I’m not going to disclose the method right now, but I want to show off, so here’s a (real) screenshot taken from my Cisco Fabric Manager. I have three MDS 9212, two Brocade SilkWorm 2800 and one Brocade SilkWorm 2400. The latter three are old and slow, which is why I’ve used the Cisco MDS switches as my core switches. I’ve erased the names of the servers and the storage devices. Although it’s only my QA lab, I do not wish to disclose too many internal details. I believe this (mainly) multi-path environment will allow me to configure storage ports and LUNs for the hosts dynamically, without the need to physically disconnect fibre cables and reconnect them at an alternate location (such as another switch). This setup is flexible, highly available, and well documented (in an Excel document I keep in addition to this nice management software). That way I can track down devices per port, per switch, and/or per PWWN.

Cisco Fabric Manager and my SAN.

Dyslexia at HP Israel

Wednesday, April 12th, 2006

For those of you not familiar with Hebrew, this entry won’t say much. For those of you who are familiar with it, it might say more.

HP Laptops are bundled with a pack of CDs, meant for easy reinstallation/recovery on their laptops. These CDs can be used as the quickest method to recover the laptop to its original state.

Since in Israel some laptops come with Hebrew Windows, and some come with English Windows, HP has decided to allow a per-customer selection. It means you get one CD for English Windows, and one CD for Hebrew Windows. It’s only natural that on the Hebrew Windows CD, you’ll have Hebrew text.

Hebrew is a funny language, written right to left and not the other way around. It is not uncommon for print originating in a non-Hebrew country, or from a company with no Hebrew connection, to come out backwards, left to right. You get the hang of reading backwards when you’ve been in the computer business long enough.

If HP had supplied a CD where the Hebrew text was merely spelled backwards, I wouldn’t even have written an entry about it. However, they did that, and also swapped letters every once in a while. Look at the picture:

Dyslexia!

The title in Hebrew says something like (trying to imitate the letter swaps):

"Oepratins Gysmet Mocpactdiks"

It might have been "Operating System Compactdisk", but not quite. It goes on for every other Hebrew text on the CD, getting less terrible towards the bottom of the CD. Nice, still.

I’ve just had to put it here :-)

Tweakomatic – MS do something right

Monday, April 10th, 2006

Pretend you have a medium-sized environment, all Windows, and you want to accomplish one small task. For example, get the list of installed software on all computers. Great. You can sit down for a day or two and write some huge and complicated piece of software which will open each and every registry on your network and compute the results, and then add it to your GPO (if you’re lucky enough to have one). During the next few minutes/hours/days, you’ll get your list.

Option two is to use Tweakomatic, Microsoft’s WMI script auto-generation utility. It won’t do everything, but it will do most of the things you actually want and need. For a one-time run, or for a scheduled run, you just use the script this utility has so easily created for you. Nice.

And they have a good sense of humor too.

Moving Exchange Data

Thursday, April 6th, 2006

Let’s assume you have a method of taking a point-in-time copy of the Microsoft Exchange DB and logs, while the system is running, to an alternate server. Let’s assume, while we’re at it, that this point-in-time copy is consistent, that you can mount this store on an alternate server (depending on using a similar directory structure, etc.), and that it works correctly, i.e., mounts without a problem. The scenario can look like this:

Server A: Microsoft Exchange, with a Storage Group containing a few mailbox stores, each on a different drive letter (E:, F:, G:, in our example), and the Storage Group’s logs on a separate drive, L:.

On Server B, we create a similar setup – a few mailbox stores, with similar names, on E:, F:, G:, and we create (or move) the logs to reside on L:. We make sure this server’s patch level (updates and versions) is similar to Server A’s.

We dismount the whole storage group, mark it to be overwritten by a restore, and replace the currently existing stores with our point-in-time copy from Server A. Great. Mounting the store, and, more broadly, mounting all of the storage group’s components, should be easy and painless. Our point-in-time copy is consistent, so it’s just like bringing up a storage group after an unexpected shutdown.

Let’s assume we were able to do so. We’re not finished yet. Each user’s attributes contain information pointing to the location of his/her mailbox, including the name of the store and the name of the server. We need to change these AD attributes, per user, for this point-in-time replication/DRP to work.

A friend of mine, Guy, has created a script to solve exactly this issue. It still has some minor issues, but if you are aware of them, you can handle them quite easily. They are:

1. To run the script, make sure it is accessible via the same path on each computer running ADU&C (required only on the computers which run it). You can put it on a share, and I think it will work (haven’t tested it), or you can put it on a local directory, but make sure other computers from which you would want to run this option, have this script in the same directory (same path).

2. The script / GUI does not understand the option "Cancel", although it’s there. If you pick "Cancel", you get to actually select "0". Be aware of it.

3. The script requires resolution per OU. It means that it’s easier to move the users sharing the same mailbox store into the same OU, at least for the purpose of running the script. You could create an OU under an existing OU, and move only the users sharing the same mailbox store into it, obtaining the GPO and settings propagated to it from above.

4. There is no "uninstall" option. Don’t want it? Don’t use it. Can’t remove it unless you know what you’re doing.

I tend to believe these flaws/bugs/issues will be dealt with someday, but for the minor usage I had, it was more than enough.

By the way – so far, this trick cannot be used for Public Folders, as their information is hidden far too deep. Maybe someday.

I’ve been away for a week due to work abroad

Sunday, April 2nd, 2006

And had the chance to be in one of the largest server farms I’ve ever been to. Could not take pictures, though.

We were connected to a proxied and limited network inside the organization, with a limited set of allowed web sites. It was terrible. Then I figured that if I purchased the wireless network connection (which was available), I could use my laptop as a router, running NAT on this connection, while still being physically connected to the internal network. Security hole? Sure is, but not mine :-)

So I connected that way using a PCMCIA wireless card (for some reason my internal Orinoco_PCI card refused to talk to that wireless network; I should find the time to diagnose this issue). I ended up with the following configuration:

1) wlan0 (PCMCIA wifi via ndiswrapper) – Internet

2) eth2 (First wired network card) connected to the internal LAN, and used as GW for one of my team

3) eth1 (Orinoco_PCI wifi card) in ad-hoc mode, acting as GW for another one of my team, who sat just far enough so I could not throw him a cable.

It worked, and worked fine.
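
For the curious, a setup like this can be sketched as follows. The post does not say which tool did the NAT, so iptables with MASQUERADE is an assumption here, and the helper name nat_commands is hypothetical. The Python below only builds and prints the commands; actually running them would need root on the laptop.

```python
from typing import List


def nat_commands(wan_if: str, lan_ifs: List[str]) -> List[str]:
    """Build shell commands that would turn a Linux box into a NAT gateway.

    wan_if  -- interface facing the Internet (traffic leaving it is masqueraded)
    lan_ifs -- interfaces whose clients are routed out through wan_if
    """
    cmds = [
        # Let the kernel forward packets between interfaces.
        "sysctl -w net.ipv4.ip_forward=1",
        # Rewrite the source address of everything leaving via the WAN side.
        f"iptables -t nat -A POSTROUTING -o {wan_if} -j MASQUERADE",
    ]
    for lan in lan_ifs:
        # Allow connections from the LAN side out to the WAN side...
        cmds.append(f"iptables -A FORWARD -i {lan} -o {wan_if} -j ACCEPT")
        # ...and only reply traffic in the other direction.
        cmds.append(
            f"iptables -A FORWARD -i {wan_if} -o {lan} "
            "-m state --state ESTABLISHED,RELATED -j ACCEPT"
        )
    return cmds


if __name__ == "__main__":
    # Interface names as listed above: wlan0 to the Internet, eth2 and eth1 as gateways.
    for line in nat_commands("wlan0", ["eth2", "eth1"]):
        print(line)
```

The clients behind eth2 and eth1 would also need addresses and a default route pointing at the laptop, which this sketch does not cover.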

Regarding other issues, I noticed that my laptop was running at its limit, and even that felt barely enough. For example – I used Skype. During a voice call, my CPU went up to a steady 80-85% utilization. That is high. It means that if I do anything else, I get choppy sound (which I did get). It was a good thing we had so much idle time at the customer’s site. Lots of waiting that you can pass in a voice call, and for free.

Well, it was educational. I’ve learned some additional things about how things work in very large-scale environments, with their pros and cons. Was fun.