dm-multipath and loss of all paths

By Etzion 13/05/2008

dm-multipath is a great tool. It has proven its abilities to me on many occasions, and I’m sure I’m not the only one. NetApp, for example, uses it. HP uses it as well (a slightly modified version, but still), and it works.

A problem I have encountered is as follows: if a single path fails, the device-mapper continues to work correctly (as expected) and the remaining path becomes active. However, if the last path fails, all processes which require disk access become stale (they hang). This means that many monitoring tests which only look for a given process pass even when that process is stale, its access to the filesystem delayed forever. Worse, tests which attempt to write or read a file on such a stale filesystem become stale themselves, which can bring down an entire system. Assume we have a cron job which creates a file every minute: every new process becomes stale immediately, so after an hour we’ll have 60 more processes, and after a day 1440 additional processes, all in uninterruptible sleep (D) and waiting for the disk to come back.
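A check that looks at process state, rather than mere process existence, can notice this pile-up. The following is a minimal sketch, assuming a Linux /proc filesystem; the function names are my own and are not part of any monitoring tool. It lists the processes currently in state D:

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split at the LAST closing parenthesis.
    left, _, right = stat_line.rpartition(")")
    pid_text, _, comm = left.partition("(")
    state = right.split()[0]
    return int(pid_text), comm, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process in uninterruptible sleep (D)."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_state(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

A handful of processes in D state is normal on a busy machine, so a real check would alert only when the count keeps growing, or when the same PIDs stay in D state across several samples.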

Certain detection systems actually fail to auto-detect stale filesystems when dm-multipath is in use. This is caused by a (default) feature setting called “1 queue_if_no_path”, which makes the device-mapper queue I/O instead of failing it when no path is available. I discovered that when this option is omitted, such as in the configuration below (only the “device” section):

device
{
        vendor                  "NETAPP"
        product                 "LUN"
        getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
        prio_callout            "/sbin/mpath_prio_ontap /dev/%n"
#       features                "1 queue_if_no_path"
        hardware_handler        "0"
        path_grouping_policy    group_by_prio
        failback                immediate
        rr_weight               uniform
        rr_min_io               128
        path_checker            readsector0
}

the failure of all paths will actually result in the filesystem layer reporting I/O errors (which is good). A filesystem on such a device can be mounted with an error-handling option, such as (for ext2/3) errors=continue, errors=remount-ro or errors=panic. The last is my favorite, as it ensures data integrity through a self-fencing mechanism.
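Once I/O errors are reported instead of queued, a monitoring test gets an answer back rather than hanging. A cautious test can still put a time limit on itself so that it never joins the pile of stale processes. The following is a minimal sketch of such a probe; the function name, the timeout value and the use of a helper child process are my own assumptions, not something shipped with dm-multipath:

```python
import subprocess
import sys
import time


def probe_path(path, timeout=5.0):
    """Return "ok", "error" or "hung" for a stat() of path.

    The stat() runs in a child process, so if it blocks in the kernel
    only the child is stuck, never the monitoring process itself.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout
    while True:
        return_code = child.poll()  # non-blocking check
        if return_code is not None:
            # Any failure (I/O error, missing path, ...) maps to "error".
            return "ok" if return_code == 0 else "error"
        if time.monotonic() >= deadline:
            break
        time.sleep(0.1)
    # Deliberately no wait() here: a child stuck in D state does not die
    # on SIGKILL until its I/O returns, and waiting would hang us as well.
    child.kill()
    return "hung"


if __name__ == "__main__":
    print(probe_path(sys.argv[1] if len(sys.argv) > 1 else "/"))
```

Note that a stat() may be answered from cache without touching the disk, so a stricter probe would read or write a small file instead. Each "hung" result can also leave one stuck child behind, so the probe should not be run more often than needed.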
