Escalating Events based on Business Criticality
IBM Tivoli Netcool/Impact provides a common platform for data access that circumvents organizational boundaries. It extends OMNIbus solutions so that data from virtually any source can be used to correlate, calculate, enrich, deliver, notify, escalate, visualize and perform a wide range of automated actions.
This technical article demonstrates how information held in a MySQL database can be used to automatically escalate an alert based on the relative criticality to the business.
The EIF probe and postzmsg commands are used to generate test alerts.
Configuration Data
The data that we will use to determine the severity of an incoming alert is held in an external database. Instructions for creating the database and sample data in either MySQL or DB2 are given below:
MySQL
mysql> create database cmdb;
mysql> use cmdb;
mysql> create table Device ( Hostname VARCHAR(255), Facility VARCHAR(255) );
mysql> create table Department ( DeptName VARCHAR(255), Location VARCHAR(255) );
mysql> insert into Device values ( 'server1','Crewe' );
mysql> insert into Device values ( 'server2','Nantwich' );
mysql> insert into Device values ( 'server3','Winsford' );
mysql> insert into Device values ( 'server4','Winsford' );
mysql> insert into Device values ( 'server5','Crewe' );
mysql> insert into Device values ( 'server6','Northwich' );
mysql> insert into Department values ( 'Engineering', 'Nantwich' );
mysql> insert into Department values ( 'HR', 'Crewe' );
mysql> insert into Department values ( 'Operations', 'Winsford' );
mysql> insert into Department values ( 'Catering', 'Northwich' );
mysql> insert into Department values ( 'Facilities', 'Crewe' );
DB2
db2 => create database cmdb
db2 => connect to cmdb
db2 => create table Device ( Hostname VARCHAR(255), Facility VARCHAR(255) )
db2 => create table Department ( DeptName VARCHAR(255), Location VARCHAR(255) )
db2 => insert into Device values ( 'server1','Crewe' )
db2 => insert into Device values ( 'server2','Nantwich' )
db2 => insert into Device values ( 'server3','Winsford' )
db2 => insert into Device values ( 'server4','Winsford' )
db2 => insert into Device values ( 'server5','Crewe' )
db2 => insert into Device values ( 'server6','Northwich' )
db2 => insert into Department values ( 'Engineering', 'Nantwich' )
db2 => insert into Department values ( 'HR', 'Crewe' )
db2 => insert into Department values ( 'Operations', 'Winsford' )
db2 => insert into Department values ( 'Catering', 'Northwich' )
db2 => insert into Department values ( 'Facilities', 'Crewe' )
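Before wiring the data into Impact, it can help to sanity-check which departments each server should map to. The sketch below loads the same sample rows into an in-memory SQLite database (a stand-in chosen only so the example is self-contained; it is not MySQL or DB2) and joins Device.Facility to Department.Location. The function name departments_for is made up for this illustration.

```python
import sqlite3

# In-memory SQLite stand-in for the cmdb database (assumption: used only
# for illustration; the article itself uses MySQL or DB2).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Device ( Hostname VARCHAR(255), Facility VARCHAR(255) );
    CREATE TABLE Department ( DeptName VARCHAR(255), Location VARCHAR(255) );
""")
conn.executemany("INSERT INTO Device VALUES (?, ?)", [
    ("server1", "Crewe"), ("server2", "Nantwich"), ("server3", "Winsford"),
    ("server4", "Winsford"), ("server5", "Crewe"), ("server6", "Northwich"),
])
conn.executemany("INSERT INTO Department VALUES (?, ?)", [
    ("Engineering", "Nantwich"), ("HR", "Crewe"), ("Operations", "Winsford"),
    ("Catering", "Northwich"), ("Facilities", "Crewe"),
])


def departments_for(hostname):
    """Return the departments located at the same facility as the host."""
    rows = conn.execute(
        "SELECT d.DeptName FROM Device v "
        "JOIN Department d ON d.Location = v.Facility "
        "WHERE v.Hostname = ? ORDER BY d.DeptName",
        (hostname,),
    ).fetchall()
    return [name for (name,) in rows]


for host in ("server1", "server3"):
    print(host, departments_for(host))
```

With the sample data, server1 maps to Facilities and HR, while server3 maps to Operations, which is the case the rest of the article escalates.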
Build the solution
Create a new project
– Log into Impact as the admin or other suitably permissioned user
– Select the NCI server instance if you need to (NCI is the default name)
– Select the Projects tab
– Click the New Projects + icon
– Enter a project name (orbEventEnrichment)
– Click OK
Create the Event Source
– Make sure the orbEventEnrichment project is selected in the Projects drop down box
– Drop down the Data Sources And Types menu
– Select ObjectServer
– Click the + icon
– Enter the Data Source Name as NCOMS
– Enter the Username as root
– Disable Backup
– Enter the hostname where the ObjectServer resides for the Primary Source
– Click the Test Connection button to make sure everything is OK
– Click OK
Create the Data Source to access the cmdb database (MySQL)
– Drop down the Data Sources And Types menu
– Select MySQL
– Click the + icon
– Enter the Data Source Name as CMDB
– Enter an appropriate username and password
– Disable Backup
– Enter the Host Name, Port and the Database as cmdb
– Click the Test Connection button to make sure everything is OK
– Click OK
Create the Data Source to access the cmdb database (DB2)
– Drop down the Data Sources And Types menu
– Select DB2
– Click the + icon
– Enter the Data Source Name as CMDB
– Enter an appropriate username and password
– Disable Backup
– Enter the Host Name, Port and Database as cmdb
– Click the Test Connection button to make sure everything is OK
– Click OK
Create the Device Data Type (MySQL)
– Drop down the Data Sources And Types menu
– Click the + icon next to CMDB
– Enter Devices as the Data Type Name
– Make sure CMDB is selected as the Data Source Name
– Make sure the Enabled checkbox is selected
– In the Table Description section, select cmdb from the Base Label drop down box
– Select Device from the drop down box next to it
– Click Refresh, this should bring back the table fields
– Make Hostname the key field
– Select Hostname as the Display Name Field
– Click the Save icon (floppy disk) and then close the tab
You should end up with something like this:
Create the Device Data Type (DB2)
– Drop down the Data Sources And Types menu
– Click the + icon next to CMDB
– Enter Devices as the Data Type Name
– Make sure CMDB is selected as the Data Source Name
– Make sure the Enabled checkbox is selected
– In the Table Description section, enter Device into the Base Label text box
– Click Refresh, this should bring back the table fields
– Make Hostname the key field
– Select Hostname as the Display Name Field
– Click the Save icon (floppy disk) and then close the tab
You should end up with something like this:
Create the Department Data Type (MySQL)
– Drop down the Data Sources And Types menu
– Click the + icon next to CMDB
– Enter Department as the Data Type Name
– Make sure CMDB is selected as the Data Source Name
– Make sure the Enabled checkbox is selected
– In the Table Description section, select cmdb from the Base Label drop down box
– Select Department from the drop down box next to it
– Click Refresh, this should bring back the table fields
– Make DeptName the key field
– Select DeptName as the Display Name Field
– Click the Save icon (floppy disk) and then close the tab
Create the Department Data Type (DB2)
– Drop down the Data Sources And Types menu
– Click the + icon next to CMDB
– Enter Department as the Data Type Name
– Make sure CMDB is selected as the Data Source Name
– Make sure the Enabled checkbox is selected
– In the Table Description section, enter Department into the Base Label text box
– Click Refresh, this should bring back the table fields
– Make DeptName the key field
– Select DeptName as the Display Name Field
– Click the Save icon (floppy disk) and then close the tab
Create a Dynamic Link
This will establish a relationship between the Device and Department information. In this example, Device Facility and Department Location are related i.e. if a device at a facility fails, a department at the same location will be impacted.
– Drop down the Data Sources And Types menu
– Click the Devices Data Type (the actual word Devices)
– Select the Dynamic Links tab
– Click the New Link by Filter icon
– Select Department as the Target Data Type
– Enter a filter of Location = '%Facility%'
– Click OK
– Click the Save icon and then close the tab
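The link filter above relies on %Facility% being replaced with the Facility value of the source Devices item before the Department lookup is made. A rough Python model of that substitution is sketched here; expand_link_filter is a hypothetical helper written for this illustration and says nothing about how Impact implements the feature internally.

```python
import re


def expand_link_filter(template, source_item):
    """Replace each %Field% placeholder with the source item's field value."""
    def substitute(match):
        # match.group(1) is the field name between the percent signs
        return str(source_item[match.group(1)])
    return re.sub(r"%(\w+)%", substitute, template)


# A Devices data item, represented here as a plain dictionary.
device = {"Hostname": "server3", "Facility": "Winsford"}
print(expand_link_filter("Location = '%Facility%'", device))
```

For server3 the expanded filter reads Location = 'Winsford', which is the condition applied to the Department data type.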
Test the Link
– Drop down the Data Sources And Types menu
– Click the View Data Items icon for Devices
– Click the view linked data icon for one of the servers
– This should bring up a window showing the associated Departments
Create the Policy
For this example, the Operations department is considered to be critical to the business. If any devices located at the same facility fail, we want to automatically increase the severity of the associated alert.
To do so, the policy must first determine the facility of the device by querying the Devices data type. Then it must find the departments at that location, this time by using the dynamic link.
Finally, each department at the location is checked. If the Operations department is impacted, the severity of the alert is increased.
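The three steps just described can be modelled outside Impact to check the expected outcome. The following is only a Python sketch of the decision logic, not Impact policy code: the dictionaries stand in for the Devices and Department data types, enrich is a made-up function name, and the starting severity of 2 is an arbitrary placeholder.

```python
# Hypothetical stand-ins for the Devices and Department data types.
DEVICE_FACILITY = {
    "server1": "Crewe",
    "server2": "Nantwich",
    "server3": "Winsford",
}
DEPARTMENT_LOCATION = {
    "Engineering": "Nantwich",
    "HR": "Crewe",
    "Operations": "Winsford",
    "Facilities": "Crewe",
}

CRITICAL_DEPARTMENT = "Operations"
CRITICAL_SEVERITY = 5  # the Severity value the policy sets for Critical


def enrich(event):
    """Return a copy of the event, escalated if Operations is impacted."""
    # Step 1: find the facility of the device named in the event.
    facility = DEVICE_FACILITY.get(event["Node"])
    if facility is None:
        return dict(event)  # unknown device, leave the event unchanged
    # Step 2: find the departments at that location (the dynamic link).
    impacted = [d for d, loc in DEPARTMENT_LOCATION.items() if loc == facility]
    # Step 3: escalate only if the critical department is among them.
    if CRITICAL_DEPARTMENT in impacted:
        return {**event, "Severity": CRITICAL_SEVERITY}
    return dict(event)


for node in ("server1", "server2", "server3"):
    print(node, enrich({"Node": node, "Severity": 2})["Severity"])
```

Only the server3 event comes back with Severity 5, because Winsford is the only facility that hosts the Operations department.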
– Drop down the Policies menu
– Select the Custom template
– Click the + icon
– Name the policy orbEventEnrichment and click Save
Writing code within the browser editor can be a bit awkward, especially when it comes to indentation. You may find it easier to use another editor, such as vi, and then copy and paste the code in. To help with indentation I would recommend expanding tabs, which can be done in vi with the following settings (add them to your .exrc file to make them permanent):
set expandtab
set shiftwidth=4
set softtabstop=4
set tabstop=4
The policy code:
/*
    Policy: orbEventEnrichment
    Author: Ant Mico
    Date  : February 2009
    Desc  : Sample policy demonstrating some key Impact policy functionality.
            Loosely based on the example given in the Solution Guide
            (with the errors removed!).
*/

// Set up some variables that are used by the logging function
policyName = "orbEventEnrichment";
debugLevel = 1;

// Log a start up message
// Note the use of a library policy which contains regularly used functions
// This is a normal policy, just need to fully qualify the function to access it
orbFunctionLibrary.orbLogger(debugLevel, policyName, "START");

// Query the Devices DataType
// We assume that the Node field in the Omnibus alert correlates with the Hostname
dataType = "Devices";
filter = "Hostname = '" + @Node + "'";
countOnly = False;

// GetByFilter will return an array with the matching Data Items
devices = GetByFilter(dataType, filter, countOnly);

// The Length() function returns the number of elements in the array
If ( Length(devices) < 1 ) {
    orbFunctionLibrary.orbLogger(debugLevel, policyName, "No devices found.");
} Else {
    index = 0;
    While ( index < Length(devices) ) {
        msg = "Device " + devices[index].Hostname + " is in the " + devices[index].Facility + " facility.";
        orbFunctionLibrary.orbLogger(debugLevel, policyName, msg);
        index = index + 1;
    }

    // Now we can use the link to get the impacted Departments
    // Create an array in which the target DataType is stored
    dataTypes = { "Department" };

    // Set the filter and maximum rows to return
    filter = NULL;
    maxToReturn = 10000;
    departments = GetByLinks(dataTypes, filter, maxToReturn, devices);

    If ( Length(departments) < 1 ) {
        orbFunctionLibrary.orbLogger(debugLevel, policyName, "No departments found.");
    } Else {
        index = 0;
        While ( index < Length(departments) ) {
            // Store the array element in a separate variable to
            // make accessing it easier
            dept = departments[index];
            msg = "Department " + dept.DeptName + " is impacted.";
            orbFunctionLibrary.orbLogger(debugLevel, policyName, msg);

            // Check to see if it is the Operations department
            If ( dept.DeptName == "Operations" ) {
                // It is, so set the event Severity field to 5 (Critical)
                // Note that the @ syntax is shorthand for the EventContainer
                // variable which contains the event under consideration.
                // The other way to access it would be EventContainer.Severity
                @Severity = 5;

                // This next bit is important. Once the event has been
                // changed we must return it so that it gets updated in
                // Omnibus. Note the use of the EventContainer variable.
                ReturnEvent(EventContainer);
            }
            index = index + 1;
        }
    }
}

orbFunctionLibrary.orbLogger(debugLevel, policyName, "FINISH");
Note the use of the orbLogger function, which resides in a policy called orbFunctionLibrary. This is just another normal policy but it demonstrates how functions can be referenced across policies. This is very useful for grouping regularly used bits of code.
The orbFunctionLibrary policy:
/*
    Policy: orbFunctionLibrary
    Author: Ant Mico
    Date  : February 2009
    Desc  : Group of functions that may be useful to other policies
*/

/*
    Function to implement standard policy logging.
    Takes three arguments:
    - debugLevel, an Integer 0 (off) or 1 (on)
    - policyName, a String containing the name of the policy
    - message, a String containing the message to log
    debugLevel could be extended to include more verbose information
    like CurrentContext() etc.
*/
Function orbLogger(debugLevel, policyName, message) {
    debugLevel = Int(debugLevel);
    If ( debugLevel > 0 ) {
        Log(LocalTime(getDate()) + " " + policyName + " : " + message);
    }
}
Create the EventReader service
This is the service that will read events from the ObjectServer.
– Drop down the Services menu
– Select OmnibusEventReader from the menu
– Click the + icon
– Enter the service name as orbOmnibusEventReader
– Change the Data Source to be NCOMS
– Select the Event Mapping tab
– Click the New Mapping button
– Enter the Filter Expression Node LIKE '^server[0-9]+$' so that only alerts from servers that match this filter will trigger the policy
– Select the orbEventEnrichment policy as the policy to run
– Check the Active checkbox
– Click OK
– Click OK
– The service should appear in the Service Status window at the bottom left of the page, as shown below:
Start the orbOmnibusEventReader service
– Click the start button
– Click the log button to see what it is doing
– You should see the EventReader periodically querying the ObjectServer for alerts
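Before generating test events, it can be useful to check which Node values the event mapping filter '^server[0-9]+$' is intended to accept. The sketch below uses Python's re module as a stand-in for the ObjectServer's pattern matching, which is an assumption: the two engines agree on this simple pattern, but their regular expression dialects may differ in other cases. The function name matches_event_filter and the sample node names are invented for the illustration.

```python
import re

# Same pattern as the event mapping filter: "server" followed by digits only.
NODE_PATTERN = re.compile(r"^server[0-9]+$")


def matches_event_filter(node):
    """Return True if the node name would pass the event mapping filter."""
    return NODE_PATTERN.search(node) is not None


for node in ("server1", "server42", "webserver1", "server", "server3.example.com"):
    print(node, matches_event_filter(node))
```

Only server1 and server42 match here. Names with a prefix, no digits, or a domain suffix are rejected, so test events must use the bare hostnames held in the Device table.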
Generate test events
Using postzmsg or an equivalent, generate some test alerts:
postzmsg -f $OMNIHOME/bin/tec.cfg -r WARNING -m "Hardware failure detected" hostname=server1 origin=server1 Device_Down TEC
postzmsg -f $OMNIHOME/bin/tec.cfg -r WARNING -m "Hardware failure detected" hostname=server2 origin=server2 Device_Down TEC2
postzmsg -f $OMNIHOME/bin/tec.cfg -r WARNING -m "Hardware failure detected" hostname=server3 origin=server3 Device_Down TEC3
You should see the alert from server3 being escalated to a Critical severity.
The log for the PolicyLogger service should contain the following types of entries:
Parser log: 2009-02-19 17:10:08.000 orbEventEnrichment : Device server1 is in the Crewe facility.
Parser log: 2009-02-19 17:10:08.000 orbEventEnrichment : Department HR is impacted.
Parser log: 2009-02-19 17:10:08.000 orbEventEnrichment : Department Facilities is impacted.
Parser log: 2009-02-19 17:10:08.000 orbEventEnrichment : FINISH
Parser log: 2009-02-19 17:10:57.000 orbEventEnrichment : START
Parser log: 2009-02-19 17:10:57.000 orbEventEnrichment : Device server2 is in the Nantwich facility.
Parser log: 2009-02-19 17:10:57.000 orbEventEnrichment : Department Engineering is impacted.
Parser log: 2009-02-19 17:10:57.000 orbEventEnrichment : FINISH
Parser log: 2009-02-19 17:11:21.000 orbEventEnrichment : START
Parser log: 2009-02-19 17:11:21.000 orbEventEnrichment : Device server3 is in the Winsford facility.
Parser log: 2009-02-19 17:11:21.000 orbEventEnrichment : Department Operations is impacted.
MP.returnEvent did eri.putEvent for EventContainer: (OwnerUID=65534, Class=6601, Service=, Serial=430, RemoteSecObj=, TECFQHostname=, LocalNodeAlias=, TaskList=0, TECEventHandle=, PhysicalPort=0, NmosEntityId=0, LocalPriObj=, NmosObjInst=0, TECDate=, LocalRootObj=, EventId=, Flash=0, ProcessReq=0, TECHostname=, RemoteRootObj=, ExpireTime=0, SuppressEscl=0, ReceivedWhileImpactDown=0, InternalLast=1235063480, Grade=1, TECStatus=, Node=server3, RemoteNodeAlias=, RemotePriObj=, TECServerHandle=, Severity=5, ExtendedAttr=, StateChange=1235063480, KeyField=430, Acknowledged=0, NmosManagedStatus=0, FirstOccurrence=1235063480, ServerName=NCOMS_A, URL=, Poll=0, PhysicalCard=, NmosSerial=, Identifier=:TEC3:Device_Down, OwnerGID=0, LastOccurrence=1235063480, X733ProbableCause=0, Agent=TEC3, AlertGroup=Device_Down, PhysicalSlot=0, NmosDomainName=, NmosCauseType=0, Summary=Hardware failure detected, Tally=1, TECRepeatCount=0, NodeAlias=server3, Location=, Type=1, LocalSecObj=, X733SpecificProb=, Manager=tivoli_eif probe on carl, X733EventType=0, Customer=, AlertKey=TEC3, EventReaderName=orbOmnibusEventReader, ServerSerial=430, X733CorrNotif=, TECDateReception=)
Parser log: 2009-02-19 17:11:21.000 orbEventEnrichment : FINISH