SunCluster, VxVM, and a system image. Sounds nice, right? No.

Tuesday, October 11th, 2005

Because of a customer’s problem, and because sending a person on site is expensive, the people at my job decided to ask the customer to send us a ufsdump of one of his SunCluster nodes, so that we could try to reproduce his environment in our lab. Well, it is hardly as simple as that. The machine is set up as follows:

1) Veritas Foundation Suite (VxVM in particular) is in use: the root file system ("/") is encapsulated, as are swap and /var.

2) Single node of a whole SunCluster.

I tried to make it work. First, I noted that there is no guide in the world called "SunCluster Troubleshooting". You can work with SunCluster from within, but you cannot (officially, at least) work on it from the outside. Every document out there uses the sc* commands for actions on a cluster node; however, when SunCluster is malfunctioning, the machine doesn’t boot up completely. If, like me, you have to boot the machine (a Sun SPARC) with the "boot -x" flag, you won’t be able to maintain the cluster. The only documents I was able to find containing the phrase "SunCluster Troubleshooting" were people’s online CVs.

The first part was to boot the encapsulated root slice. I had to boot from CD (I use a purposely broken JumpStart, which is designed to leave me with a shell on the machine), edit /etc/vfstab, edit /etc/system (so it won’t map the root slice into VxVM), edit /etc/hosts (for the machine’s IP), rename /etc/hostname.<something> to /etc/hostname.hme0 (due to the hardware layout), change /etc/defaultrouter to point to my own router, and remap the devices: I had to manually relink /dev/rdsk/c0t0d0s* to /devices/…@…/…/…@disk:a, and so on. A dirty job, but the machine was finally able to boot (using the -x flag), and it left me with a crippled system, yelling about VxVM and remapped disk devices. Great. Now I had to clear the VxVM settings somehow, recreate (and then re-encapsulate) the root slice, and get the machine to boot up and work. It wasn’t simple, and it took me a while to understand how to get there, especially since vxconfigd was screaming about RPC errors and stale configuration, and was unable to do anything at all. That will be added to the blog later.
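
The /etc/system and /etc/vfstab part of the steps above can be roughly sketched as a small script to run from the CD shell. This is only a sketch under a few assumptions: the restored root slice is mounted at an alternate root (such as /a), the encapsulation left the usual rootdev and vol_rootdev_is_volume entries in /etc/system, and VxVM saved the pre-encapsulation table as /etc/vfstab.prevm. The function name and the .orig backups are purely illustrative, and the install-db marker is the commonly documented way to keep VxVM from starting at boot, not something taken from the steps above.

```shell
#!/bin/sh
# Sketch: make a mounted Solaris root image boot from the plain disk slice
# instead of the VxVM root volume. The function name and the ".orig" backup
# suffix are hypothetical choices made for this example.

unencapsulate_root() {
    alt_root=$1                              # where the root slice is mounted, e.g. /a
    system_file="$alt_root/etc/system"
    vfstab="$alt_root/etc/vfstab"
    state_dir="$alt_root/etc/vx/reconfig.d/state.d"

    # Keep copies of the files exactly as they arrived in the dump.
    cp "$system_file" "$system_file.orig" || return 1
    cp "$vfstab" "$vfstab.orig" || return 1

    # In /etc/system a leading '*' marks a comment. Comment out the two
    # entries that make the kernel take / from the vxio pseudo-device.
    # (No "sed -i" here: the Solaris sed of that era does not have it.)
    sed -e 's/^rootdev:.*vxio/* &/' \
        -e 's/^set vxio:vol_rootdev_is_volume/* &/' \
        "$system_file.orig" > "$system_file" || return 1

    # Assumption: encapsulation saved the original table as vfstab.prevm.
    # Its device names come from the source machine and may still need
    # editing by hand to match the lab hardware.
    if [ -f "$vfstab.prevm" ]; then
        cp "$vfstab.prevm" "$vfstab" || return 1
    else
        echo "no $vfstab.prevm found; edit $vfstab by hand" >&2
    fi

    # Assumption: an install-db marker file keeps the VxVM startup
    # scripts from launching the volume manager at boot.
    if [ -d "$state_dir" ]; then
        touch "$state_dir/install-db" || return 1
    fi
}
```

With the root slice mounted on /a, it would be called as `unencapsulate_root /a`. The network files (/etc/hosts, /etc/hostname.hme0, /etc/defaultrouter) and the /dev/rdsk links still have to be fixed by hand, since they depend on the lab hardware.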