Redhat Cluster NFS client service – things to notice
Friday, January 16th, 2009
I encountered an interesting bug/feature of RHCS on RHEL4.
A snippet of my configuration looks like this:
<resources>
    <fs device="/dev/mapper/mpath6p1" force_fsck="1" force_umount="1" fstype="ext3" name="share_prd" mountpoint="/share_prd" options="" self_fence="0" fsid="02001"/>
    <nfsexport name="nfs export4"/>
    <nfsclient name="all ro" target="192.168.0.0/255.255.255.0" options="ro,no_root_squash,sync"/>
    <nfsclient name="app1" target="app1" options="rw,no_root_squash,sync"/>
</resources>
<service autostart="1" domain="prd" name="prd" nfslock="1">
    <fs ref="share_prd">
        <nfsexport ref="nfs export4">
            <nfsclient ref="all ro"/>
            <nfsclient ref="app1"/>
        </nfsexport>
    </fs>
</service>
This setup was working just fine, until a glitch in the DNS occurred. The glitch resulted in an inability to resolve names (which were not present in /etc/hosts at the time), and led to a failover with the following error:
clurgmgrd: [7941]: <err> nfsclient:app1 is missing!
All range-based nfsclient agents seemed to function correctly. I only managed to look into it a while later (after setting up a simple range-based allow-all access), and through some googling I found this explanation – the agent's response to the “status” command had changed.
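To illustrate the idea, here is a rough sketch (not the actual nfsclient agent code) of what the status check boils down to: the configured target string has to appear among the clients that knfsd currently lists in /var/lib/nfs/etab, and if the literal string is not there, the resource is reported missing and the service fails over.

#!/usr/bin/env python
# Sketch only: mimics the spirit of the nfsclient "status" check, not the real agent.
ETAB = "/var/lib/nfs/etab"

def exported_clients(path=ETAB):
    """Collect the client names/ranges knfsd currently exports to."""
    clients = set()
    with open(path) as etab:
        for line in etab:
            line = line.strip()
            if not line:
                continue
            # etab lines look like: <export path> <client>(<options>)
            export_path, client_spec = line.split(None, 1)
            clients.add(client_spec.split("(", 1)[0])
    return clients

def status_ok(target):
    """The configured target must match an etab client verbatim."""
    return target in exported_clients()

if __name__ == "__main__":
    for target in ("192.168.0.0/255.255.255.0", "app1", "app1.mydomain.org"):
        print("%s: %s" % (target, "OK" if status_ok(target) else "missing"))

With the short name in the cluster configuration but the fully qualified name in etab, such a string comparison fails even though the export itself is perfectly healthy.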
I should have looked inside /var/lib/nfs/etab and seen that the app1 server appeared there with its fully qualified name. I changed the resource settings to reflect that:
<nfsclient name="app1" target="app1.mydomain.org" options="rw,no_root_squash,sync"/>
and it seems to work just fine now.
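For reference, the /var/lib/nfs/etab entry for this client carries the fully qualified name; it looks roughly like this (the option list knfsd actually writes is longer, abbreviated here):

/share_prd	app1.mydomain.org(rw,sync,no_root_squash,...)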