Redhat Cluster NFS client service – things to notice

I encountered an interesting bug/feature of RHCS on RHEL4.

A snippet of my configuration looks like this:

<resources>
    <fs device="/dev/mapper/mpath6p1" force_fsck="1" force_unmount="1" fstype="ext3" name="share_prd" mountpoint="/share_prd" options="" self_fence="0" fsid="02001"/>
    <nfsexport name="nfs export 4"/>
    <nfsclient name="all ro" target="192.168.0.0/255.255.255.0" options="ro,no_root_squash,sync"/>
    <nfsclient name="app1" target="app1" options="rw,no_root_squash,sync"/>
</resources>

<service autostart="1" domain="prd" name="prd" nfslock="1">
    <fs ref="share_prd">
       <nfsexport ref="nfs export 4">
          <nfsclient ref="all ro"/>
          <nfsclient ref="app1"/>
       </nfsexport>
    </fs>
</service>

This setup worked fine until a DNS glitch occurred. The glitch made it impossible to resolve names (which were not present in /etc/hosts at the time) and led to a failover with the following error:

clurgmgrd: [7941]: <err> nfsclient:app1 is missing!

All range-based nfsclient agents seemed to function correctly. I could only look into it a while later (after setting up simple range-based allow-all access in the meantime), and some googling turned up the explanation: the way the agent responds to the “status” command had changed.

I should have looked inside /var/lib/nfs/etab and seen that the app1 server appeared there under its full name. I changed the resource settings to reflect that:

<nfsclient name="app1" target="app1.mydomain.org" options="rw,no_root_squash,sync"/>

and it seems to work just fine now.
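
If the aim is to not depend on name resolution at all, one possible alternative is to give the nfsclient resource an address rather than a name. This is only a sketch under an assumption: it presumes app1 has a static address, and 192.168.0.15 below is a made-up placeholder, not a value from this setup.

```xml
<!-- Hypothetical address: replace 192.168.0.15 with the real, static IP of app1 -->
<nfsclient name="app1" target="192.168.0.15" options="rw,no_root_squash,sync"/>
```

Whichever form is used, the idea is the same as with the full name: the target should match the way the client actually shows up in /var/lib/nfs/etab.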


6 Responses to “Redhat Cluster NFS client service – things to notice”

  1. Brady Says:

    Hi,

    I have been trying to configure a simple, basic ext3 FS on NFS as a failover instance using RHCS. All I could find for help was nfscookbook.pdf and http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Configuration_Example_-_NFS_Over_GFS/index.html, both of which cover the GFS-NFS combination. Could you give me any details on setting up a basic failover instance, with or without a virtual IP?

    Also, I would like to know what the use of an nfsclient resource is in a situation where I do not know where or who the client is. What is the use of an NFS mount resource? How exactly is the NFS export useful when all we can provide as input for setup is a name?

    I am urgently looking for answers and am posting my questions to you in the hope of an early answer. (I would definitely prefer it to be sent to my e-mail address too.)

    Thanks much for your time and patience.

    regards,
    Brady

  2. ez-aton Says:

    You *will* need a virtual IP. You will also need to set up an fsid on the mounted partition. Explanation below.

    The nfsclient resource sets the access permissions for each of your clients. It’s not just about allowing the NFS service to work; it also defines who you export the filesystem to, whether that is RO for one network and RW for another, RW for everyone, and so on. It is not an actual resource, just a common method of grouping who can access the export and who can’t.

    About the shared filesystem: NFS uses several methods. Since your clients will quite probably use NFSv3 over TCP, maintaining a virtual IP for the service is required. Also, since clients will use DirectIO (very common with today’s NFS clients), without an fsid you will most likely get “stale file handle” errors on the clients after several failovers. The fsid is an identifier that should travel with the NFS share wherever it goes, so that clients can re-establish their connections and reacquire their file handles easily.

    I’m sorry I can’t send you an e-mail. I’m overworked and have only a little time these days.

    You are more than invited to check out my blog anytime, or even, God forbid, subscribe to it 🙂
    Ez
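
The virtual IP and fsid advice in this reply could be sketched in cluster.conf roughly as follows. This is a hedged illustration, not a tested configuration: the address 192.168.0.50 is a made-up placeholder, and the fs line simply reuses the device, mount point and fsid from the post above.

```xml
<resources>
    <!-- Hypothetical floating address that clients mount from; it moves with the service on failover -->
    <ip address="192.168.0.50" monitor_link="1"/>
    <!-- The fsid is part of the filesystem definition, so whichever node runs the service exports it with the same ID -->
    <fs device="/dev/mapper/mpath6p1" fstype="ext3" name="share_prd" mountpoint="/share_prd" fsid="02001"/>
</resources>
```

Clients would then mount from the floating address rather than from any node's own address, so a failover does not change the server they talk to.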

  3. Jason Priebe Says:

    I’ve run into a number of issues with NFS under Linux (not specific to the cluster, although we’ve had our challenges with RH Cluster, too). I’ve summarized some of the gotchas here: http://smorgasbork.com/linux/35-linux/77-problems-with-our-linux-nfs-server

  4. ez-aton Says:

    I was able to define particular read/write settings for different hosts. The trick is to use one line. I have no idea as to how you did it, but a good example would be something like this:
    /mnt/share 192.168.0.1(rw,no_root_squash,sync) 192.168.0.2(ro,no_root_squash,async) 192.168.1.0/255.255.255.0(ro,async) *(ro,async)

    Also, RHCS should send not only poweroff, but also poweron. It does so for other fencing devices. Disabling ACPI causes the Linux system to halt immediately rather than attempt an orderly shutdown. An orderly shutdown is bad (which is why ACPI should be turned off) if the system cannot halt for some reason, such as stale NFS file handles or other similar causes.

    Ez
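
In cluster.conf terms, the per-host settings from the exports line in this reply would presumably become one nfsclient resource per entry, all referenced under the same nfsexport. The following is only a sketch: the resource names are invented for illustration, while the targets and options are copied from that line.

```xml
<!-- Resource names are hypothetical; targets and options mirror the /etc/exports example -->
<nfsclient name="host1 rw" target="192.168.0.1" options="rw,no_root_squash,sync"/>
<nfsclient name="host2 ro" target="192.168.0.2" options="ro,no_root_squash,async"/>
<nfsclient name="net1 ro" target="192.168.1.0/255.255.255.0" options="ro,async"/>
<nfsclient name="world ro" target="*" options="ro,async"/>
```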

  5. John Mac Says:

    Hello,
    I see that this is a pretty old thread… but I was wondering HOW to mount the NFS client resource? I always received permission denied. I created the nfs export resource/client resource configuration in luci. I have been using NFS over GFS for some time, but I was not able to mount it until adding it to /etc/exports…which I believe is incorrect. However, it seems I needed a combination of the luci setup AND /etc/exports.

  6. ez-aton Says:

    You have to create the NFS resource, and then a client resource. A client resource is not an NFS client, but a definition of the specific client(s) for this particular NFS resource.
    So your hierarchy would look like:
    IP -> Disk -> NFS -> NFS Client

    Cheers!
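
The IP -> Disk -> NFS -> NFS Client hierarchy from this reply could look roughly like this as a service definition. It is a sketch that reuses the resource names from the post and assumes an ip resource with the placeholder address 192.168.0.50 has been defined in the resources section; that address is hypothetical.

```xml
<service autostart="1" domain="prd" name="prd" nfslock="1">
    <!-- Hypothetical virtual IP; ip resources are referenced by their address -->
    <ip ref="192.168.0.50">
        <!-- Disk: the shared filesystem -->
        <fs ref="share_prd">
            <!-- NFS: the export itself -->
            <nfsexport ref="nfs export 4">
                <!-- NFS Client: who may access the export, and how -->
                <nfsclient ref="all ro"/>
                <nfsclient ref="app1"/>
            </nfsexport>
        </fs>
    </ip>
</service>
```

With the client definitions nested this way, the exports are managed by the cluster service itself rather than through /etc/exports.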
