Archive for May, 2008

Oracle ASM and EMC PowerPath

Wednesday, May 28th, 2008

Setting up Oracle ASM disks is rather simple, and the procedure can be easily obtained from here, for example. This is nice and pretty, and works well for most environments.

EMC PowerPath creates meta devices which utilize the underlying paths, as the Linux SCSI layer (scsi_mod) sees them, without hiding them (unlike IBM’s RDAC, for example). This results in the ability to view and access each LUN either through the PowerPath meta device (/dev/emcpower*) or through the underlying SCSI disk device (/dev/sd*). You can obtain the existing paths of a single meta device by running the command

powermt display dev=emcpowera

where ‘emcpowera’ is an example. It can be any of your power meta devices. You will see the underlying SCSI devices.

During startup, Oracle ASM (startup script: /etc/init.d/oracleasm) scans all block devices for ASM headers. On a system with many LUNs, this can take a while (half an hour, and sometimes much more). Not only that, but since ASM scans the available block devices in a semi-random order, chances are very high that a /dev/sd* device will be used instead of the /dev/emcpower* block device. This results in degraded performance where an active-active configuration has been set for PowerPath (because it will not be used), and moreover, a failure of that specific link will result in failure to access the LUN, regardless of any other existing paths to it.

To "set things right", you need to edit /etc/sysconfig/oracleasm, and exclude all 'sd' devices from the ASM scan.
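As a minimal sketch of that change, the two scan settings in /etc/sysconfig/oracleasm could look like the fragment below. The file is sourced as shell, so these are plain variable assignments. The values are assumptions for a host where every ASM disk sits behind PowerPath; if some ASM disks live on local 'sd' devices, excluding 'sd' would hide them as well.

```shell
# Fragment of /etc/sysconfig/oracleasm (sourced as shell by the init script).
# Scan devices whose names start with "emcpower" first...
ORACLEASM_SCANORDER="emcpower"
# ...and skip the underlying single-path SCSI devices altogether.
ORACLEASM_SCANEXCLUDE="sd"
```

After editing the file, restarting the service (or running /etc/init.d/oracleasm scandisks) should make ASM pick the disks up again through the meta devices.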

To verify that you’re actually using the right block device:

/etc/init.d/oracleasm listdisks

Select any one of the DG disks, and then

/etc/init.d/oracleasm querydisk DATA1
Disk "DATA1" is a valid ASM disk on device [120, 6]

The numbers are the major and minor of the block device. You can easily find the device through this command:

ls -la /dev/ | grep MAJOR | grep MINOR

In our example, the MAJOR will be 120, and the MINOR will be 6. The result would look like a single block device.
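The double grep can also match unrelated lines (a date or a size containing the same digits), so here is a slightly stricter sketch. The function name match_major_minor is my own, and it assumes the usual 'ls -l' layout where the fifth field is the major followed by a comma and the sixth is the minor.

```shell
# Print the names of block devices with the given major and minor numbers.
# Reads `ls -l /dev` style output on stdin.
match_major_minor() {
    awk -v major="$1" -v minor="$2" '
        # Block device lines start with "b"; field 5 is "MAJOR," and field 6 is MINOR.
        /^b/ && $5 == (major ",") && $6 == minor { print $NF }
    '
}
```

Usage, for the example above: ls -l /dev | match_major_minor 120 6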

If you’re using EMC PowerPath, your block device major would be 120 and around that number. If you’re (mistakenly) using one of the underlying paths, your major would be 8 and nearby numbers. If you’re using Linux LVM, your major would be around the number 253. The expected result, when using EMC PowerPath is always with major of 120 – always using the /dev/emcpower* devices.

This also decreases the boot time rather dramatically.

RHEL4 can see only 8 cores on a 16-core server

Wednesday, May 21st, 2008

I have encountered this on several occasions. Red Hat Linux ES, by default, uses the smp kernel, which is limited to eight cores, or two sockets. You find out that your multi-socket hardware, with its 16 (or more…) cores, shows you only the first eight, either by the simple method of running 'top' and then pressing '1', or by running 'cat /proc/cpuinfo'.

A simple solution to the problem is to change grub so it loads the largesmp kernel at boot time, and reboot. You will get all your cores.
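To check the result after the reboot, something along these lines can help. It is only a sketch: the helper names are mine, and it assumes the largesmp kernel carries "largesmp" in its release string.

```shell
# Count logical CPUs listed in a cpuinfo-style file (defaults to /proc/cpuinfo).
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Succeed if a kernel release string (defaults to the running kernel) is a largesmp build.
is_largesmp() {
    case "${1:-$(uname -r)}" in
        *largesmp*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Running count_cpus with no arguments on the server should now report all the cores, and is_largesmp should succeed.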

This is not required, for some reason, on AS server.

Graphing on-demand Linux system performance parameters

Tuesday, May 20th, 2008

Current servers are way more powerful than we could have imagined before. With quad-core CPUs, even the simple dual-socket servers contain lots of horse-power. Remember our attitude towards CPU power five years ago, and see that we’re way beyond our needs.

When modern servers are equipped with at least eight cores, other, non-CPU related issues become noticeable. Storage, as always, remains a common bottleneck, and, as an increase in expectations always accompanies an increase in abilities, memory and other elements can be the cause of performance degradation.

'sar' is a well-known tool for Linux and other Unix flavors; however, understanding its contents is not trivial, and while the data is there, figuring out what is relevant to the issue at hand becomes even more complicated with more disk devices and more CPUs.

kSar is a simple Java utility which turns this whole mess into simple, readable graphs, which can be exported to PDF for the pleasure of the customers (where applicable). It parses existing sar files, or the extracted contents of 'sa' files (from, by default, /var/log/sa/). It is a useful tool, and I recommend it with all my heart.

Alas, when it comes to parsing 'sa' files, you will need, in most cases, either to export the file to text on the source machine, or to use a matching version of the sysstat tools, as the binary format used by sar changes between versions.
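Exporting to text on the source machine can be sketched like this. The function name and the file names in the usage line are hypothetical; forcing the C locale is my assumption, made so that the timestamps come out in a predictable format for the parser.

```shell
# Dump every metric from a binary 'sa' file as text, in the C locale.
# Arguments: the 'sa' file to read, and the text file to write.
export_sa_to_text() {
    sa_file=$1
    out_file=$2
    [ -r "$sa_file" ] || { echo "cannot read $sa_file" >&2; return 1; }
    LC_ALL=C sar -A -f "$sa_file" > "$out_file"
}
```

Usage: export_sa_to_text /var/log/sa/sa20 /tmp/sar20.txt, and then load the resulting text file into kSar.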

You can obtain the sysstat utils from here, and compile them for your needs. You will need only 'sar' on your own machine.

An important note – you will not be able to compile sysstat utils using GCC 4.x. Only 3.x will do it. The error would look like:

warning: ‘packed’ attribute ignored for field of type `unsigned char…

followed by compilation errors. Using GCC version 3.x will work just fine.

Vlan Tagging with bonding network interface on RHEL4

Saturday, May 17th, 2008

This is not a simple task, as there are a few things which should actually happen for it to work.

First – the switch port should support vlan tagging (of course, right?)

I have used vlan2 for “external” network, and vlan3 for “internal” network.

My configuration looks like this:

ifcfg-eth0:

DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
ISALIAS=no

ifcfg-eth1:

DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
ISALIAS=no

ifcfg-bond0:

DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes

ifcfg-bond0.2:

DEVICE=bond0.2
BOOTPROTO=static
IPADDR=1.2.3.4
NETMASK=255.255.255.0
ONBOOT=yes
VLAN=yes

ifcfg-bond0.3:

DEVICE=bond0.3
BOOTPROTO=static
IPADDR=192.168.0.1
NETMASK=255.255.255.0
ONBOOT=yes
VLAN=yes
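Since the two VLAN files differ only in the VLAN ID and the address, they can be generated. The helper below is a sketch of mine, not part of any Red Hat tooling; it writes exactly the fields shown above into a directory of your choice.

```shell
# Write an ifcfg file for a VLAN sub-interface on top of a bonding device.
# Arguments: target directory, parent device, VLAN ID, IP address, netmask.
write_vlan_ifcfg() {
    dir=$1 parent=$2 vid=$3 ip=$4 mask=$5
    cat > "$dir/ifcfg-$parent.$vid" <<EOF
DEVICE=$parent.$vid
BOOTPROTO=static
IPADDR=$ip
NETMASK=$mask
ONBOOT=yes
VLAN=yes
EOF
}
```

Usage: write_vlan_ifcfg /etc/sysconfig/network-scripts bond0 2 1.2.3.4 255.255.255.0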

I hope it helps anyone who is into vlan tagging over bonding interfaces.

dm-multipath and loss of all paths

Tuesday, May 13th, 2008

dm-multipath is a great tool. Its abilities were proven to me on many occasions, and I’m sure I’m not the only one. NetApp, for example, use it. HP use it as well (a slightly modified version, and still), and it works.

A problem I have encountered is as follows: if a single path fails, the device-mapper continues to work correctly (as expected) and the remaining path becomes active. However, if the last link fails, all processes which require disk access become stale. It means that many tests which search for a given process pass correctly even when this process has become stale through delayed (forever) access to the filesystem. Also, tests which attempt to write/read a file to/from such a stale filesystem become stale themselves, which can bring down an entire system (assume we have a cron job which creates a file every minute. Every new process becomes stale immediately, so after an hour we’ll have 60 more processes, and after a day, 1440 additional processes, all stale (state D) and waiting for the disk to come back).
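A monitoring check that does not touch the filesystem itself can simply count the processes stuck in state D. This is a rough sketch (the function name is mine); a few D-state processes are normal on a busy machine, so the threshold to alert on is up to you.

```shell
# Count processes in uninterruptible sleep (state D).
# Reads `ps -eo stat=` style output (one state string per line) on stdin.
count_dstate() {
    awk '$1 ~ /^D/ { n++ } END { print n + 0 }'
}
```

Usage: ps -eo stat= | count_dstate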

Certain detection systems actually fail to auto-detect cases of stale filesystems when using dm-multipath. This is caused by a (default) option called “1 queue_if_no_path”. I discovered that when this option is omitted, such as in the configuration below (only the “device” section):

device
{
    vendor "NETAPP"
    product "LUN"
    getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
    prio_callout "/sbin/mpath_prio_ontap /dev/%n"
    # features "1 queue_if_no_path"
    hardware_handler "0"
    path_grouping_policy group_by_prio
    failback immediate
    rr_weight uniform
    rr_min_io 128
    path_checker readsector0
}

the loss of all paths will actually result in the filesystem layer reporting I/O errors (which is good). A filesystem on such a device can be mounted with special parameters, such as (for ext2/3): errors=continue, errors=remount-ro or errors=panic, my favorite, as it ensures data integrity through a self-fencing mechanism.
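A small wrapper can keep typos out of the errors= option. This is a sketch only: the function name is mine, the device and mount point in the usage line are hypothetical, and the three accepted values are the ones ext2/ext3 document (continue, remount-ro, panic).

```shell
# Mount an ext2/ext3 filesystem with an explicit on-error policy.
# Arguments: block device, mount point, policy (continue, remount-ro or panic).
mount_with_errors_policy() {
    device=$1 mountpoint=$2 policy=$3
    case "$policy" in
        continue|remount-ro|panic) ;;
        *) echo "unsupported errors= policy: $policy" >&2; return 1 ;;
    esac
    mount -o "errors=$policy" "$device" "$mountpoint"
}
```

Usage: mount_with_errors_policy /dev/mapper/mpath0 /mnt/data panic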

A new heir for the *nix family

Friday, May 9th, 2008

While he would require some ramp-up, this is the new guy in our technical group. He is a newbie, but I’m sure he will grow to be a great technical person, putting his father way behind.

Ugly like his dad, but shows a great technical potential

i810 dual-pipe issues with power management

Friday, May 9th, 2008

I have had a problem with my IBM X41 ever since I started using Ubuntu 7.10 (after a nice upgrade from 7.04): whenever the lid was closed and reopened, the display would flicker for a short while (while the lid is up) and then go completely blank.

My (ugly) workaround was to force the computer to sleep whenever it happened. It seemed to be a good enough workaround for most cases. In some cases, the laptop would do just the same when it was placed in its docking station.

I have found an Ubuntu bug here, which seems to describe this problem too. It exposed a few additional problems as well. The error message I got (through SSH, of course) when viewing the logs said that the video card detected pipe A to be the active pipe, that it stopped using pipe B (which appeared to be the internal one) and that it decided to disable clone mode. Wow. I just lost my internal LCD. With an external display connected, I get the whole picture working just fine; however, I cannot use the laptop like that.

After a major struggle with various i810 options, I have looked and found an option to disable Power Management. I have done so, according to the note here, and it solved all my problems in this area – for now.

Dial-up in Israel through Orange 3G

Saturday, May 3rd, 2008

I have set up a small script to allow me to dial-up using my cell to the internet. The speed of the 3G connection is quite amazing, and this information would assist, I’m sure, others as well. I am using Bluetooth to communicate between my cell and my portable computer.

Steps:

1. Create an /etc/wvdial.conf with the following contents:

[Dialer Defaults]
Phone = *99***1#
Username = orange
Password = mobile54
New PPPD = yes
Modem = /dev/rfcomm1
Baud = 460800
Init2 = atz
ISDN = off
Modem Type = Analog Modem
Dial Attempts = 1
Abort on No Dialtone = off
Stupid Mode = on

2. Pair your mobile and your laptop (check it on the net). Get the hardware ID

3. Get the channel for DUN (or Dial-Up Networking)

4. Add this script in /usr/local/sbin/ (I called it “gprs”). Replace the zeros with your own hardware ID, and the number 4 (Nokia N95) with the channel you use:

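#!/bin/bash
# Bind rfcomm device 1 to the phone's DUN channel (keeps running in the background)
rfcomm connect 1 00:00:00:00:00:00 4 &
PID_BT=$!
echo $PID_BT
sleep 5
# Dial out using /etc/wvdial.conf
wvdial &
PID_WV=$!
echo $PID_WV
sleep 7
ifconfig
echo "Press Ctrl+C to disconnect"

# On Ctrl+C, hang up first and then drop the Bluetooth link
trap "{ kill $PID_WV; sleep 1; kill $PID_BT; exit; }" SIGINT

while true; do sleep 10; done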

5. You need to run the script under “sudo”. Ctrl+C will exit and disconnect.
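For steps 2 and 3, the BlueZ tools can do the lookups: 'hcitool scan' lists nearby devices with their hardware IDs, and 'sdptool search' can query the phone for its Dial-Up Networking service. The helper below is a sketch (the name dun_channel is mine) which assumes the sdptool output contains a line of the form "Channel: N" for the RFCOMM channel.

```shell
# Print the first RFCOMM channel found in `sdptool search DUN` style output on stdin.
dun_channel() {
    awk '$1 == "Channel:" { print $2; exit }'
}
```

Usage, with the zeros replaced by your own hardware ID: sdptool search --bdaddr 00:00:00:00:00:00 DUN | dun_channel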

Good luck.

HP EVA bug – Snapshot removed through sssu is still there

Friday, May 2nd, 2008

This is an interesting bug I have encountered:

The output of an sssu command should look like this:

EVA> DELETE STORAGE "\Virtual Disks\Linux\oracle\SNAP_ORACLE"

EVA>

It still leaves the snapshot (SNAP_ORACLE in this case) visible, until “Ok” is pressed in the web interface.

This happened to me on HP EVA with HP StorageWorks Command View EVA 7.0 build 17.

When a second, identical delete command is given, it looks like this:

EVA> DELETE STORAGE "\Virtual Disks\Linux\oracle\SNAP_ORACLE"

Error: Error cannot get object properties. [ Deletion completed]

EVA>

When this command is given for a non-existing snapshot, it looks like this:

EVA> DELETE STORAGE "\Virtual Disks\Linux\oracle\SNAP_ORACLE"

Error: \Virtual Disks\Linux\oracle\SNAP_ORACLE not found

So I run the removal command twice (scripted) on an sssu session without “halt_on_errors”. This removes the snapshots correctly.
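Generating the doubled command is trivial to script. The sketch below only emits the two DELETE lines for a given snapshot path; the function name is mine, and the commands which select the manager and the storage system at the top of the sssu script are left out, as they depend on your setup.

```shell
# Emit the same DELETE STORAGE command twice for one snapshot path.
emit_double_delete() {
    for attempt in 1 2; do
        # %s keeps the backslashes in the path untouched
        printf 'DELETE STORAGE "%s"\n' "$1"
    done
}
```

Usage: emit_double_delete '\Virtual Disks\Linux\oracle\SNAP_ORACLE' and append the output to the script file fed to sssu.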