RedHat Cluster custom Oracle “Agent”/script V1.0
Working with RH Cluster quite a lot, I have decided to create an online store of custom agents/scripts.
I have not, so far, invested the effort of making these agents accept settings from the cluster.conf file, but this might happen.
Let the library be!
Oracle DB script/agent:
Although I discovered (a bit late) that RH Cluster for Oracle Ent. Linux 5.2 does include an Oracle DB agent, this script should be good enough for RHEL4 RH Cluster versions as well.
This script only checks that the ‘smon’ process is up. Nothing fancy. In the future, it could also check that Oracle responds to SQL queries (that is, that it is actually working).
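A minimal sketch of what such an SQL check might look like, not part of the published script. The names `run_sql` and `sql_status` are hypothetical helpers introduced here, and the sketch assumes the oracle user can connect with `sqlplus / as sysdba`, as the script's start and stop functions already do. The marker string is concatenated inside the query so that an error message echoing the statement cannot be mistaken for a successful answer.

```shell
#!/bin/bash
# Sketch only: a status check that asks the instance to answer a trivial query.
ORACLE_USER=${ORACLE_USER:-oracle}

# Hypothetical helper: feeds SQL from stdin to sqlplus as the oracle user.
# Kept separate so it can be replaced when no database is available.
run_sql () {
    su - "$ORACLE_USER" -c "sqlplus -S / as sysdba"
}

# Returns 0 only if the query result (the marker string) shows up in the output.
sql_status () {
    out=$(run_sql 2>&1 <<'EOF'
set heading off
set feedback off
select 'DB_'||'ALIVE' from dual;
exit
EOF
)
    echo "$out" | grep -q 'DB_ALIVE'
}
```

In the agent, `status` could then require both the smon process and `sql_status` to succeed. A query that hangs would stall the cluster's status check, so some timeout around it is probably wise.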
#!/bin/bash
#Service script for Oracle DB under RH Cluster
#Written by Ez-Aton
#http://run.tournament.org.il

# Global variables
ORACLE_USER=oracle
HOMEDIR=/home/$ORACLE_USER
OVERRIDE_FILE=/var/tmp/oracle_override
REC_LIST="[email protected]"

function override () {
        if [ -f $OVERRIDE_FILE ]
        then
                exit 0
        fi
}

function start () {
        su - $ORACLE_USER -c ". $HOMEDIR/.bash_profile ; sqlplus / as sysdba << EOF
startup
EOF
"
        status
}

function stop () {
        su - $ORACLE_USER -c ". $HOMEDIR/.bash_profile ; sqlplus / as sysdba << EOF
shutdown immediate
EOF
"
        status && return 1 || return 0
}

function status () {
        ps -afu $ORACLE_USER | grep -v grep | grep smon
        return $?
}

function notify () {
        mail -s "$1 oracle on `hostname`" $REC_LIST < /dev/null
}

override
case "$1" in
start)  start
        notify $1
        ;;
stop)   stop
#       notify $1
        ;;
status) status
        ;;
*)      echo "Usage: $0 start|stop|status"
        ;;
esac
I usually place this script (with execution permissions, of course) in /usr/local/sbin and call it as a “script” from the cluster configuration. You will probably need to alter the first few variable lines to match your environment.
Listener Agent/script:
The tnslsnr should be started/stopped as well, if we want the $ORACLE_HOME to migrate as well. This is its agent/script:
#!/bin/bash
#Service script for Oracle DB under RH Cluster
#Written by Ez-Aton
#http://run.tournament.org.il

ORACLE_USER=oracle
HOMEDIR=/home/$ORACLE_USER
OVERRIDE_FILE=/var/tmp/oracle_override

function override () {
        if [ -f $OVERRIDE_FILE ]
        then
                exit 0
        fi
}

function start () {
        su - $ORACLE_USER -c ". $HOMEDIR/.bash_profile ; lsnrctl start"
        status
}

function stop () {
        su - $ORACLE_USER -c ". $HOMEDIR/.bash_profile ; lsnrctl stop"
        status && return 1 || return 0
}

function status () {
        su - $ORACLE_USER -c ". $HOMEDIR/.bash_profile ; lsnrctl status"
}

override
case "$1" in
start)  start
        ;;
stop)   stop
        ;;
status) status
        ;;
*)      echo "Usage: $0 start|stop|status"
        ;;
esac
Again – place it in /usr/local/sbin and call it from the cluster configuration file as type “script”.
I will add more agents and more resources for RedHat Cluster in the future.
Nice.
Of course, I have some comments on the scripts:
1. You shouldn’t need “source .bash_profile” if you’re using “su -”
2. You could use RH’s “action” and “daemon” functions (sourced from /etc/init.d/functions) to use $ORACLE_HOME/bin/dbstart.sh/dbshut.sh for starting oracle/listener in one shot. Plus, you’ll get nice [OK]/[Failed] statuses.
+Katriel
Hi Katriel.
Nice to see you’re reading my blog. I am honored.
About your comments:
1. Although you are probably right, I remembered an issue with ‘su -’ on SUSE Linux and profile files, so I kept it in. It should cause no harm.
2. The functions file changes between versions. Under RHEL4 it was not LSB compatible (WONTFIX in their bugzilla), which causes RHCS to do weird things (stop-after-stop issues, etc). Using dbstart and dbshut could be a good idea; however, I don’t completely trust their exit codes. I need to verify their behavior (when the DB can’t start, when the DB is already started, etc) before I include them. It is still a very good idea, although I will still need to monitor the listener myself.
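To illustrate the exit-code point, here is a rough sketch of the LSB conventions a cluster script is generally expected to follow: `status` returns 0 when the service is running and 3 when it is not, and `stop` on an already stopped service counts as success. The function names below (`is_running`, `do_stop`, `lsb_status`, `lsb_stop`) are made up for this example, and `do_stop` is only a placeholder for the real shutdown call. The scripts above already return 0 from `stop` when nothing is left running; this just spells the convention out.

```shell
#!/bin/bash
# Sketch only: LSB-style return codes for status and stop.
ORACLE_USER=${ORACLE_USER:-oracle}

# True if an smon process owned by the oracle user exists.
is_running () {
    ps -u "$ORACLE_USER" -o args= 2>/dev/null | grep -q '^ora_smon_'
}

# Placeholder (hypothetical): the real "shutdown immediate" call goes here.
do_stop () {
    :
}

# LSB: 0 = running, 3 = not running.
lsb_status () {
    is_running && return 0
    return 3
}

# LSB: stopping an already stopped service is a success.
lsb_stop () {
    is_running || return 0
    do_stop
    if is_running; then
        return 1
    fi
    return 0
}
```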
On my next cluster (in a few days, I assume), I will incorporate and test your suggestions.
Thanks!
Dear Ez-Aton
We are using the lsnrctldb.sh script, but we are getting the following error on the peer unit, while the other unit is able to use this script to start the service just fine. Could you kindly help, please?
Jun 29 00:52:08 mailmeproddb1 clurgmgrd: [7047]: script:oracledblsnrctl: start of /usr/local/sbin/lsnrctldb.sh failed (returned 1)
Jun 29 00:52:08 mailmeproddb1 clurgmgrd[7047]: start on script “oracledblsnrctl” returned 1 (generic error)
Jun 29 00:52:08 mailmeproddb1 clurgmgrd[7047]: #68: Failed to start service:Ora; return value: 1
Jun 29 00:52:07 mailmeproddb1 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Jun 29 00:52:08 mailmeproddb1 clurgmgrd[7047]: Stopping service service:Ora
Jun 29 00:52:08 mailmeproddb1 multipathd: dm-3: umount map (uevent)
Jun 29 00:52:08 mailmeproddb1 avahi-daemon[6614]: Withdrawing address record for 192.168.20.19 on bond0.
Jun 29 00:52:18 mailmeproddb1 clurgmgrd[7047]: Service service:Ora is recovering
Jun 29 00:52:19 mailmeproddb1 clurgmgrd[7047]: #71: Relocating failed service service:Ora
Jun 29 00:52:33 mailmeproddb1 clurgmgrd[7047]: Service service:Ora is now running on member 2
Could it be that something is missing from the ‘oracle’ user’s home directory on that node, or that the Oracle user there is not called ‘oracle’? Try freezing the service, or use the override trick, and then run the script manually as root (only) to see how it actually behaves. Run it manually only while the service is frozen or in override mode. In override mode, run a copy of the script with the override part disabled, otherwise it will exit immediately.
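A possible way to do the manual run, sketched under some assumptions: the script path is the one from the log above, the service name `Ora` is taken from the log as well, the debug copy's location is arbitrary, and `make_debug_copy` is a helper name invented for this example. It comments out only the bare `override` call and leaves the function definition alone.

```shell
#!/bin/bash
# Sketch only: create a copy of an agent script with the override call disabled.
SCRIPT=${SCRIPT:-/usr/local/sbin/lsnrctldb.sh}

# $1 = original script, $2 = destination of the debug copy
make_debug_copy () {
    sed 's/^\([[:space:]]*\)override[[:space:]]*$/\1# override/' "$1" > "$2" &&
        chmod +x "$2"
}

# Typical sequence, as root, on the node that fails (left commented out
# because it needs a live cluster):
#   touch /var/tmp/oracle_override        # cluster calls now exit 0 at once
#   make_debug_copy "$SCRIPT" /root/lsnrctldb.debug.sh
#   bash -x /root/lsnrctldb.debug.sh start ; echo "exit code: $?"
#   rm -f /var/tmp/oracle_override        # hand control back to the cluster
#
# Alternative, assuming your rgmanager version supports freezing:
#   clusvcadm -Z Ora    # freeze the service
#   clusvcadm -U Ora    # unfreeze it afterwards
```

Running with `bash -x` prints every command as it executes, which usually shows whether the failure comes from `su`, from the profile file, or from `lsnrctl` itself.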
Ez
Dear EZ,
There is an oracle user on the standby unit but it has different UID than the active unit oracle user. Would it matter?
We are using the two scripts which are on this website. Could you kindly provide the steps to call the script manually, please?
Thanks!
It’s all about the username, and not the user ID. However, there are two things worth noting:
1. You are free to modify the script you are using in whatever way you like. No secret police will arrest you for doing that 🙂
2. A while ago, RHCS added an Oracle agent of its own. I haven’t tried it (and I’m not sure about its quality…); however, it might add some additional checks and wisdom to the process (I only check for existing processes…)
Ez
Dear Admin,
At first, thanks for providing the scripts.. 🙂 🙂
I think the above scripts are for a single database. My setup consists of 4 databases on a RHEL 5.5 cluster. Please suggest the appropriate changes for this. The $ORACLE_SID needs to be set each time for start and stop. But how can I check the status of all 4 databases? Please help..
Thanks in advance..
Regards,
Daya
That is a good question. I can offer one of the following paths:
1. Clone the script into four. Each will handle a different DB
2. Use RedHat’s built-in Oracle script. They provide something now, but since mine works (best? I can’t tell. It just works on several dozen sites…), I never bothered checking the (later-added) built-in Oracle agent. However, in RHEL 5.5, it should already be there. You can give it a try and tell me how it went 🙂
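For the first path, a rough sketch of what each clone might change, assuming the usual Oracle naming where the background process appears as `ora_smon_<SID>`. The SID value `DB1` is only an example, and `list_procs` is a helper name introduced here so the process listing can be swapped out. Matching the full process name keeps a SID such as `DB1` from also matching `DB10`.

```shell
#!/bin/bash
# Sketch only: per-database variant of the agent's status and start functions.
ORACLE_USER=${ORACLE_USER:-oracle}
ORACLE_SID=${ORACLE_SID:-DB1}   # example value; each clone sets its own SID

# Hypothetical helper: command lines of all processes owned by the oracle user.
list_procs () {
    ps -u "$ORACLE_USER" -o args= 2>/dev/null
}

# Running only if the smon process of this particular SID exists.
status () {
    list_procs | grep -q "^ora_smon_${ORACLE_SID}\$"
}

# Same startup as the original script, with the SID exported first.
start () {
    su - "$ORACLE_USER" -c "export ORACLE_SID=$ORACLE_SID ; sqlplus / as sysdba << EOF
startup
EOF
"
    status
}
```

The `stop` function would export `ORACLE_SID` the same way before `shutdown immediate`. Each clone is then registered as its own “script” resource in the cluster configuration.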
Ez