iSCSI persistent configurations agains us all
Using iSCSI with dm-multipath is rather common setup. With iSCSI running over Ethernet cables, which are too easy to disconnect (either on purpose or by mistake), being cheap and common technology – multipath becomes a must. If you have multiple network links, this is only expected that you use multipath for your iSCSI configuration. It’s cheap, it’s easy and it works.
This, however, comes with a price tag. Not money – the components are cheap and common, but there are configuration acts which should take place.
It is easy to find info either in the open-iscsi documentations, the Internet, whatever, and I will go over them just below, but there are some catches which one should be aware of.
Per the common documentation, unlike regular iSCSI communication, when dealing with multipath, you would like iSCSI to fail rather quickly and let the SCSI layer handle the errors, thus letting dm-multipath handle the errors and do its work.
The configuration directives are rather simple. In the iscsid.conf file (on RHEL5 located in /etc/iscsi/iscsid.conf ), you need to change the value
To a very short period of time. By default, it is set to 120 seconds, which are two minutes before anyone will notify the SCSI subsystem of any disk IO errors. A good value would be 5 seconds, which should allow for very short network disconnection (which could happen) and still – let the dm-multipath manage errors fast enough so that applications would not fail on disk timeouts.
Another two parameters which should be defined are the following (read the comments above them in the config file):
node.conn.timeo.noop_out_interval =node.conn.timeo.noop_out_timeout =
These values control the interval in which the iSCSI layer tests communication to the targets.
Also, in multipath.conf you will need to set the following feature, so that IOPs will not be lost:
features "1 queue_if_no_path"
These configuration directives can be found in these two pages from RedHat:
This is nice and pretty. However, if you have failed to do so at start, and defined your iSCSI targets based on the default configurations, you will notice that it still takes very long for iSCSI to notify the SCSI subsystem of the errors. You could check the values used by iSCSI through running the command:
iscsiadm -m node -T <target name>
Check out especially the line called node.session.timeo.replacement_timeout. Its value is the one which decides the actual behavior of iSCSI.
To change it, there are several methods. One of them is to clean up the iSCSI persistent configurations, located in /var/lib/iscsi and then re-login to the iSCSI targets. Only then you will have the new target configuration.
Check again with iscsiadm as described above, and check that this value matches.