Those familiar with the Netcool product suite will be aware that Impact has included a “self-monitoring” feature for some time. This enables Key Performance Indicators (KPIs) for runtime performance, such as memory usage and event queue size, to be collected and analysed. Alerts are generated based on pre-defined thresholds. Similar features have now been added to the ObjectServer and WebGUI components as of Netcool/OMNIbus v7.4 Fix Pack 3 and WebGUI v7.4 Fix Pack 2. Additionally, an interactive “Netcool Health” dashboard has been added to view these and other Netcool self-monitoring alerts.
What can be monitored?
A variety of key performance metrics can be monitored for both the ObjectServer and WebGUI.
The ObjectServer metrics relate to client connections, database table size, memory usage and the top users of different resources. The WebGUI metrics include ObjectServer response times, cache usage and JVM memory usage.
Enabling the self-monitoring features requires a number of configuration steps after the fix packs have been installed. The table below highlights those steps.
|Step||Resource||Documentation|
|Import the new ObjectServer triggers and table||Code available on the upgraded ObjectServer in the path “$OMNIHOME/extensions/selfmonitoring”||Installation and Set-up Guide|
|Add the self-monitoring elements to the WebGUI Data Sources XML file||XML elements documented||WebGUI Reference|
|Load the Netcool Health Dashboard||Code available on the upgraded ObjectServer in the path “$OMNIHOME/extensions/selfmonitoring”||Installation and Set-up Guide|
Note: At the time of writing, the updated OMNIbus V7.4 documentation is only available on the IBM InfoCenter website and not through its replacement portal, Knowledge Center.
Thresholds for the ObjectServer are edited in the ObjectServer table master.sm_thresholds. I’d suggest that any updates are applied from a custom SQL file using “nco_sql”. This will ensure that changes are documented and repeatable.
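As a rough illustration of that approach, here is a minimal Python sketch that renders the threshold changes as ObjectServer SQL text and, optionally, pipes a saved file into “nco_sql”. The column names (ExampleKeyColumn, ExampleLimitColumn), the threshold name and the server name NCOMS are placeholders I have made up for the example and not the real master.sm_thresholds schema, so check the table definition on your own ObjectServer first.

```python
"""Sketch: keep self-monitoring threshold changes in a repeatable SQL file."""
import os
import subprocess
from pathlib import Path

# ASSUMPTION: these names are hypothetical placeholders, not the real
# master.sm_thresholds columns. Inspect the table before using this.
THRESHOLD_UPDATES = [
    # (hypothetical key value, hypothetical column, new value)
    ("example_threshold", "ExampleLimitColumn", 80),
]


def render_sql(updates, key_column="ExampleKeyColumn"):
    """Return SQL text: one update per entry, then 'go' to run the batch."""
    lines = []
    for key, column, value in updates:
        lines.append(
            f"update master.sm_thresholds set {column} = {int(value)} "
            f"where {key_column} = '{key}';"
        )
    lines.append("go")
    return "\n".join(lines) + "\n"


def apply_sql(sql_path, server="NCOMS", user="root", password=""):
    """Feed a saved SQL file to nco_sql on standard input.

    ASSUMPTION: OMNIHOME is set and the credentials suit your environment.
    """
    nco_sql = Path(os.environ["OMNIHOME"]) / "bin" / "nco_sql"
    with open(sql_path) as handle:
        return subprocess.run(
            [str(nco_sql), "-server", server, "-user", user,
             "-password", password],
            stdin=handle,
            check=True,
        )


if __name__ == "__main__":
    # Print the SQL so it can be reviewed and saved under version control.
    print(render_sql(THRESHOLD_UPDATES), end="")
```

Redirecting the printed output into a file kept alongside your other ObjectServer customisations gives you a record of exactly which thresholds were changed, and the same file can be replayed against a backup or test ObjectServer.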
The WebGUI thresholds are defined within the XML elements, so they are configured whenever the data sources definition file is updated.
What do the self-monitoring alerts look like?
Both the ObjectServer and WebGUI may generate “Information” and “Problem” alerts, depending on the configuration. By default, most alerts are generated every 5 minutes, with a limited number of the ObjectServer alerts generated every 60 seconds (connection, client, memory and trigger status alerts). The “Type” field of these alerts will be set to “Informational” where the threshold has not been breached.
The figure below demonstrates both Problem and Information self-monitoring alerts.
The inclusion of self-monitoring is certainly a useful addition to the OMNIbus feature set. Some of the metrics were available from the ITM6 Agent for Netcool/OMNIbus, but that agent did not collect data from the WebGUI server, so there is a definite benefit for that component. The self-monitoring metrics are specific to the application and hence do not eliminate the need for additional monitoring. Monitoring of the system and dependent applications would still be required. For example, CPU usage figures for processes and metrics for the WebSphere Application Server would complement the OMNIbus-specific metrics.
Is it worth the pain of the upgrade? The upgrade in a single-tier ObjectServer environment was painless, as was the installation for a stand-alone WebGUI server (which hasn’t always been the case for such patches!). A little more planning would be required where a multitier ObjectServer architecture is employed, and the upgrade for a WebGUI load-balancing environment is more involved. However, the visibility of the KPIs provided by the self-monitoring will facilitate the tuning of the servers and help to maximise their efficiency and availability. If you make the time to review those figures on a regular basis, that has to be of benefit to the business.