I had a system that panicked when running the following configuration:
- Red Hat Enterprise Linux (RHEL) 4 Update 6 (4.6), 64-bit (x86_64)
- Dell PowerEdge servers
- Oracle RAC 11g with Oracle Clusterware 11g
- EMC iSCSI storage
- EMC PowerPath
- Voting and registry (OCR) LUNs are accessed as raw devices
- Data files are accessed through ASM with ASMLib
During reboots or shutdowns, the system would panic just before the actual power cycle. Unfortunately, I do not have a screen capture of the panic.
Tracing the problem, it seems that the iSCSI, PowerIscsi (EMC PowerPath for iSCSI) and networking services are brought down before the “killall” service stops CRS, so CRS is still running when its storage goes away.
The init.crs service script was never executed with the “stop” argument during the normal start-stop of services, because it never left a lock file (for example, in /var/lock/subsys). Its presence in /etc/rc.d/rc6.d and /etc/rc.d/rc0.d was therefore only for show: the links existed but did nothing.
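To illustrate why a missing lock file makes the K link useless, here is a simplified sketch of the check that, as far as I understand it, the Red Hat /etc/rc.d/rc script performs before stopping a service. This is not the real script: the function name list_stoppable and its two directory arguments are made up for the example, so that it can be tried safely on scratch directories.

```shell
#!/bin/sh
# Sketch only, not the actual /etc/rc.d/rc.
# list_stoppable RCDIR LOCKDIR
# Prints the services under RCDIR whose K link would really be run with
# "stop", i.e. those that have a lock file in LOCKDIR.
list_stoppable() {
    rcdir="$1"
    lockdir="$2"
    for link in "$rcdir"/K*; do
        [ -e "$link" ] || continue      # no K links at all
        name="${link##*/}"              # e.g. K96init.crs
        subsys="${name#K??}"            # strip "K" and the two-digit order
        # Without a lock file the service is assumed not to be running,
        # so its stop action is skipped.
        if [ -f "$lockdir/$subsys" ] || [ -f "$lockdir/$subsys.init" ]; then
            echo "$subsys"
        fi
    done
}

# Example (read-only): list_stoppable /etc/rc.d/rc6.d /var/lock/subsys
```

On the affected system, init.crs would not have shown up in such a list, because nothing ever created /var/lock/subsys/init.crs.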
I solved it by changing the /etc/init.d/init.crs script slightly:
- On the “start” action, touch the file /var/lock/subsys/init.crs
- On the “stop” action, remove the file /var/lock/subsys/init.crs
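Roughly, the two changes amount to the following. This is a minimal sketch, not Oracle's script: the helper name crs_lock and the LOCKFILE variable are my own for illustration, and in the real init.crs the touch and rm lines would go into the existing start and stop branches next to Oracle's own code.

```shell
#!/bin/sh
# Sketch only. LOCKFILE can be overridden for testing; the default is the
# path used in the fix described above.
LOCKFILE="${LOCKFILE:-/var/lock/subsys/init.crs}"

# crs_lock ACTION
# Maintains the lock file so that the rc scripts consider the service
# running and call it with "stop" at shutdown.
crs_lock() {
    case "$1" in
        start)
            touch "$LOCKFILE"       # mark the service as started
            ;;
        stop)
            rm -f "$LOCKFILE"       # mark the service as stopped
            ;;
    esac
}

# Hypothetical placement inside init.crs:
#   start)  ...existing start code...;  crs_lock start ;;
#   stop)   ...existing stop code...;   crs_lock stop  ;;
```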
Also, although I am not sure it is necessary, I changed the SysV execution order of the init.crs script in /etc/rc.d/rc0.d and /etc/rc.d/rc6.d from wherever it was (K96 in one case and K76 in the other) to K01, so that it is executed with the “stop” parameter early in the shutdown or reboot cycle.
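The reordering is just a rename of the K links. A small sketch, assuming the links follow the usual KNNname pattern; the function name relink_k01 is hypothetical, and it takes the directory as an argument so it can be tried on a copy first.

```shell
#!/bin/sh
# Sketch only.
# relink_k01 RCDIR SERVICE
# Renames RCDIR/KNN<SERVICE> (whatever NN currently is) to RCDIR/K01<SERVICE>.
relink_k01() {
    rcdir="$1"
    svc="$2"
    for link in "$rcdir"/K[0-9][0-9]"$svc"; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$link" ] || [ -L "$link" ] || continue
        # Already at K01: nothing to do.
        [ "$link" = "$rcdir/K01$svc" ] && continue
        mv "$link" "$rcdir/K01$svc"
    done
    return 0
}

# Example, to be run as root on the real directories:
#   relink_k01 /etc/rc.d/rc0.d init.crs
#   relink_k01 /etc/rc.d/rc6.d init.crs
```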
This solved the problem, but anyone upgrading Oracle Clusterware in the future will need to be aware of this change.