Archive for August, 2011

Hot resize Multipath Disk – Linux

Friday, August 19th, 2011

This post is for the users of the great dm-multipath system in Linux, who encounter a major availability problem when attempting a resize of mpath devices (and their partitions), and find themselves scheduling a reboot.

This documented is based on a document created by IBM called "Hot Resize Multipath Storage Volume on Linux with SVC", and its contents are good for any other storage. However - it does not cover the procedure required in case of a partition on the mpath device (for example - mpath1p1 device).

I will demonstrate with only two paths, but, with understanding this process, it can be well used for any amount of paths for a device.

I do not explain how to reduce a LUN size, but the apt viewer will be able to generate a method out of this document. I, for myself, try to avoid as much as I can from shrinking LUNs. I prefer backup, LUN recreation, and then restore. In many case - it's just faster.

So - back to our topic - first - increase the size of your LUN on the storage.

Now, you need to collect the paths used for your mpath device. Check this example:

mpath1 (360a980005033644b424a6276516c4251) dm-2 NETAPP,LUN
[size=200G][features=1 queue_if_no_path][hwhandler=0][rw]
_ round-robin 0 [prio=4][active]
_ 2:0:0:0 sdc 8:32  [active][ready]
_ round-robin 0 [prio=1][enabled]
_ 1:0:0:0 sdb 8:16  [active][ready]

The devices marked in bold are the ones we will need to change. Lets get their current size:

blockdev --getsz /dev/sdb
419430400

Keep this number somewhere safe. We can (and should!) assume that sdc has the same values, otherwise, this is not the same exact path.

Collect this info for the partition as well. It will be smaller by a tiny bit:

blockdev --getsz /dev/sdb1
419424892

Keep this number as well.

Now we need to reread the current (storage-based) size parameters of the devices. We will run

blockdev --rereadpt /dev/sdb
blockdev --rereadpt /dev/sdc

Now, our size will be slightly different:

blockdev --getsz /dev/sdb
734003200

Of course, the partition size will not change. We will deal with it later. Keep the updated values as well. Of course, the multipath still holds the disks with their original size values, so running 'multipath -ll' will not reveal any size change. Not yet.

We now need to create editable dmsetup map. Use the current command to create two files: cur and org containing this map:

dmsetup table mpath1 | tee org cur
0 419424892 multipath 1 queue_if_no_path 0 2 1 round-robin 0 1 1 8:32 128 round-robin 0 1 1 8:16 128

Important part - explaining some of these values. The map shows the device's size in blocks - 419424892. It shows some parameters, it shows path groups info (0 2 1), and both sub devices - sdc being 8:32 and sdb being 8:16. Try it with 'ls -la /dev/sdb' to see the minor and major. At this point, if you are not familiar with majors and minors, I would recommend you do some reading about it. Not mandatory, but will make your life here safer.

We need to delete one of the paths, so we can refresh it. I have decided to remove sdb first:

multipathd -k"del path sdb"

Now, running the multipath command, we will get:

mpath1 (360a980005033644b424a6276516c4251) dm-2 NETAPP,LUN
[size=200G][features=1 queue_if_no_path][hwhandler=0][rw]
_ round-robin 0 [prio=4][active]
_ 2:0:0:0 sdc 8:32  [active][ready]

Only one path. Good. We will need to edit the 'cur' file created earlier to reflect the new settings we are to introduce:

0 419424892 multipath 1 queue_if_no_path 0 1 1 round-robin 0 1 1 8:32 128

The only group left was the one containing 'sdc' (8:32), and since one group down, the bold number was changed from 2 to 1 (as there is only a single path group now!)

We need to reload multipath with these settings:

dmsetup suspend mpath1; dmsetup reload mpath1 cur; dmsetup resume mpath1

The correct response for this line is 'ok'. We pause mpath1, reload and then resume it. It is best to be in a single line, as this process freezes IO for a short period of time on the device, and we prefer it to be as short as possible.

Now, as /dev/sdb is not a part of the multipath managed devices, we can modify it. I usually use 'fdisk' - deleting the old partition, and recreating it in the new size, but you must make sure, if your device requires LUN alignment, that you recreated the partition from the same start point. I will dedicate a post some time to LUN alignment, but not at this particular time. Just a hint - if you're not sure, run fdisk in expert mode and get a printout of your partition table (fdisk /dev/sdb and then x and then p). If your partition starts at 128 or 64, it is aligned. If not (usually for large LUNs - at 63), you are not, and you should either be worried about it, but not now, or should not care at all.

Back to our task.

We need to grab the size of the newly created partition, for later use. Write it down somewhere.

blockdev --getsz /dev/sdb1
733993657

Following the partition recreation, we need to introduce the device to the multipath daemon. We do this by:

multipathd -k"add path sdb"

followed by immediately removing the remaining device:

multipathd -k"del path sdc"

We need to have our 'cur' file updated, so we can release the device to our uses. This time, we update both the size section with the new size, and the new, remaining path. Now, the file looks like this:

0 734003200 multipath 1 queue_if_no_path 0 1 1 round-robin 0 1 1 8:16 128

As mentioned before - the large number in bold is the new size of the block device. The amount of failure groups is one (1), also in bold, and the device name is 'sdb' which is 8:16. Save this modified file, and run:

dmsetup suspend mpath1; dmsetup reload mpath1 cur; dmsetup resume mpath1

Running the command 'multipath -ll' you will get the real size of the device.

mpath1 (360a980005033644b424a6276516c4251) dm-2 NETAPP,LUN
[size=350G][features=1 queue_if_no_path][hwhandler=0][rw]
_ round-robin 0 [prio=1][active]
_ 1:0:0:0 sdb 8:16  [active][ready]

We will need to reread the partition layout of /dev/sdc. The quickest way is by running:

partprobe

This should do it. We can now add it back in:

multipathd -k"add path sdc"

and then run

multipath

(which should result in all the available paths, and the correct size).

Our last task is to update the partition size. The partition, normally, is called mpath1p1, so we need to read its parameters. Lets keep it in a file:

dmsetup table mpath1p1 | tee partorg partcur

We should now edit the newly created file 'partcur' with the new size. You should not change anything else. Originally, it looked like this:

0 419424892 linear 253:2 128

and it was modified to look like this:

0 733993657 linear 253:2 128

Notice that the size (in bold) is the one obtained from /dev/sdb1 (!!!) and not /dev/sdb.

We need to reload the device. Again - attempt to do it in a single line, or else you will freeze IO for a long time (which might cause things to crush):

dmsetup suspend mpath1p1; dmsetup reload mpath1p1 partcur; dmsetup resume mpath1p1

Do not mistaked mpath1 with mpath1p1.

Our device is updated, our paths are online. We are happy. All left to do is to online resize the file system. With ext3, this is done like this:

resize2fs /dev/mapper/mpath1p1

The mount will increase in size online, and all left for us is to wait for it to complete, and then go home.

I hope this helps. It helped me.