Using nco_postmsg to monitor the TEMS server
Monitoring of the Tivoli infrastructure has improved greatly over the last few years with IBM’s MoSWoS programme. However, monitoring the Tivoli Enterprise Monitoring Server (TEMS) itself has always been an issue: if this server stops working, all the other monitors fail with it.
IBM wrote a document in 2006 in which they suggested using a script on the TEMS server to monitor processes and this is probably still the best solution. This tip updates that example to use the new OMNIbus command nco_postmsg which allows you to send an alert directly to an ObjectServer. This utility accepts name-value pairs for the alert data and constructs an SQL INSERT statement, which is used to insert a new row of data into a specified database table in the ObjectServer.
You can run nco_postmsg from the command line, or you can develop scripts or automations that use the nco_postmsg command to send alerts to the ObjectServer. Multiple instances of the nco_postmsg utility can also run simultaneously.
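To make the name-value mapping described above concrete, here is a minimal sketch of how a list of pairs corresponds to an INSERT statement. The function name pairs_to_insert is hypothetical and the output is an illustration only: the exact SQL that nco_postmsg generates internally is not shown in the original IBM material, so treat the statement format as an assumption.

```shell
# Illustration only: print an INSERT statement that corresponds to a set of
# name=value pairs. This is NOT the utility's own code; pairs_to_insert is a
# hypothetical helper and the SQL layout is an assumption.
pairs_to_insert() {
    table=$1; shift
    cols=""; vals=""
    for pair in "$@"; do
        name=${pair%%=*}      # text before the first '='
        value=${pair#*=}      # text after the first '='
        cols="${cols:+$cols, }$name"
        vals="${vals:+$vals, }$value"
    done
    printf 'insert into %s (%s) values (%s);\n' "$table" "$cols" "$vals"
}

pairs_to_insert alerts.status "Identifier='xyz123'" "Node='test'" "Severity=5"
```

String values carry their own single quotes and integer values such as Severity do not, which is why the pairs passed to nco_postmsg later in this tip are written the same way.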
Installation
The nco_postmsg utility is installed with the Probe Support feature of Tivoli Netcool/OMNIbus, and can therefore be deployed separately from the other Tivoli Netcool/OMNIbus features, on one or more hosts. To do this, run a custom install and deselect every option except Probe Support.
Setup
Once this is done, edit the $OMNIHOME/etc/nco_postmsg.props file and add the following lines, changing the values as appropriate:
MessageLevel: 'warn' # Message reporting level
Name: 'nco_postmsg' # Name of client
UserName: 'root' # User to connect as
Password: 'ZZ' # Password for user
Server: 'ORBD_DEMO' # Server to connect to
Table: 'alerts.status' # Table to insert event
Version: FALSE # Display version information
Edit $NCHOME/etc/omni.dat and change the values as appropriate to reflect your ObjectServer’s name, hostname and port.
For example:
#
# omni.dat file as prototype for interfaces file
#
# Ident: $Id: omni.dat 1.5 1999/07/13 09:34:20 chris Development $
#
[ORBD_DEMO]
{
Primary: mgmtserver3 4100
}
[NCO_GATE]
{
Primary: mgmtserver3 4300
}
[NCO_PA]
{
Primary: mgmtserver3 4200
}
[NCO_PROXY]
{
Primary: mgmtserver3 4400
}
Then create the interfaces file by running $NCHOME/bin/nco_igen.
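Before generating the interfaces file, it can be worth confirming that omni.dat really contains the entry you expect for the server named in nco_postmsg.props. The sketch below assumes the simple section layout shown in the example above; the function name omni_primary is hypothetical and is not part of OMNIbus.

```shell
# Print "host port" for the Primary entry of a named section in omni.dat.
# omni_primary is a hypothetical helper; it assumes the layout shown above:
# a "[NAME]" header line followed by a "Primary: host port" line.
omni_primary() {
    server=$1
    file=${2:-$NCHOME/etc/omni.dat}
    awk -v want="[$server]" '
        /^\[/ { in_section = ($1 == want) }    # entering a new section
        in_section && $1 == "Primary:" { print $2, $3; exit }
    ' "$file"
}

# Example (prints nothing if the section is missing):
#   omni_primary ORBD_DEMO
```

An empty result means the section name in omni.dat does not match the Server value in the props file, which is a common reason for the test in the next section to fail.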
Test
Once this is done you should be able to test the command and see the event arrive on the OMNIbus console.
$OMNIHOME/bin/nco_postmsg -user root -password "" "Identifier='xyz123'" "Node='test'" "Severity=5" "Manager='nco_postmsg'" "Summary='An event occurred'"
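If you plan to call the command from several scripts, a small wrapper keeps the quoting in one place. The sketch below is an assumption-laden example rather than an IBM-supplied function: send_event and the NCO_POSTMSG override variable are hypothetical names, it assumes the user name and password are picked up from the props file configured earlier, and it assumes nco_postmsg follows the usual convention of a non-zero exit status on failure. It does not escape single quotes inside the summary text.

```shell
# Hypothetical wrapper around nco_postmsg.
# NCO_POSTMSG is an illustrative override so the function can be dry-run
# (for example NCO_POSTMSG=echo) on a host without OMNIbus installed.
# Usage: send_event IDENTIFIER NODE SEVERITY SUMMARY
send_event() {
    identifier=$1; node=$2; severity=$3; summary=$4
    if "${NCO_POSTMSG:-$OMNIHOME/bin/nco_postmsg}" \
        "Identifier='$identifier'" \
        "Node='$node'" \
        "Severity=$severity" \
        "Manager='nco_postmsg'" \
        "Summary='$summary'"
    then
        return 0
    fi
    # Assumed convention: a non-zero exit status means the event was not sent.
    echo "nco_postmsg failed for node $node" >&2
    return 1
}

# Example:
#   send_event xyz123 test 5 "An event occurred"
```

Setting NCO_POSTMSG=echo prints the arguments instead of sending them, which is a quick way to check the quoting before pointing the wrapper at a real ObjectServer.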
The Monitoring Script
For this tip I have used a slightly modified version of the script from the original IBM document. It could be improved, but for now it works quite well. The script below is a ksh shell script. For a batch version that works on Windows, see below.
#!/bin/ksh
# Copy all output to a log file through a ksh co-process running tee.
exec 3>&1
tee /opt/yourtivoli/logs/`hostname`-MonitorITM6.log >&3 |&
exec >&p 2>&1

i=0                        # alerts sent during the current outage
j=1                        # maximum alerts per outage
Identifier=$(date +"%s")   # set once at start-up and reused for every alert
time=300                   # seconds between checks
ProcessName="kdsmain"
node=`hostname`
OMNIHOME=/opt/IBM/tivoli/netcool/omnibus
Summary="The TEMS Process $ProcessName is not running on $node"
export OMNIHOME node Summary ProcessName

while true
do
    checkavail=`ps -ef | grep -i "$ProcessName" | grep -v grep`
    # Process is running: log it and reset the alert counter.
    if [ -n "$checkavail" ]
    then
        echo "#########################################"
        date
        echo "Found Process"
        i=0
        echo "#########################################"
    fi
    # Process is missing and no alert has been sent yet for this outage.
    if [ $i -lt $j ] && [ -z "$checkavail" ]
    then
        echo "#########################################"
        date
        echo "No Process ($ProcessName) Found, Sending Event To Object Server"
        $OMNIHOME/bin/nco_postmsg -user root -password "" "Identifier='$Identifier'" "Node='$node'" "Severity=5" "Manager='nco_postmsg'" "Summary='$Summary'" "AlertGroup='YourTivoli'"
        echo "#########################################"
        i=`expr $i + 1`
    fi
    sleep $time
done
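The i and j counters in the script implement an alert-once-per-outage policy: one event is sent when the process disappears, nothing more is sent while it stays down, and the counter resets when the process is seen again. The sketch below restates that decision as a small function so the behaviour is easier to follow; next_step and its "up/down" and "yes/no" arguments are hypothetical names used only for illustration, not part of the script above.

```shell
# Decide what one polling cycle should do (illustrative restatement of the
# i/j counter logic; next_step is a hypothetical helper).
#   $1 = "up" or "down"   current state of the monitored process
#   $2 = "yes" or "no"    whether an alert was already sent for this outage
# Prints two words: the action ("alert" or "none") and the new alerted flag.
next_step() {
    state=$1; alerted=$2
    if [ "$state" = up ]; then
        echo "none no"       # process found: reset the flag
    elif [ "$alerted" = no ]; then
        echo "alert yes"     # first failed check of this outage
    else
        echo "none yes"      # still down, alert already sent
    fi
}

# Example: next_step down no
```

Raising j in the original script allows more than one alert per outage; this sketch corresponds to the default of j=1.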
Running the Script
On UNIX, a good way to run the script is from inittab, as init will respawn the script if the monitor itself falls over.
To do this, edit /etc/inittab and add the following line:
yt:2345:respawn:/opt/yourtivoli/bin/tems_monitor.sh #TEMS Monitor
Then signal the init process so that it re-reads /etc/inittab:
kill -HUP 1
This starts the script and the monitoring process, which you can confirm with:
ps -ef | grep tems
Now, if the TEMS should ever fall over, an alert should be posted to the OMNIbus server.
ITM installed on Windows
For those with ITM installed on Windows, below is a version of the monitoring script as a batch file. Run from a custom scheduled task, it mirrors the aim of the shell script above.