I was in the Orb Data office the other day talking with some colleagues about performance and application monitoring. One of them (let’s call him Miles) was wondering why companies would buy a product such as IBM Tivoli Monitoring (ITM) rather than Nagios or a similar open source product. The conversation went something like this:
Miles: You know, Simon, I’ve been thinking: why do we always use ITM? There are other monitoring products out there, you know, and yet we always deploy ITM. What has ITM ever given us in return?
Simon: The data warehouse?
Miles: Oh yeah, yeah ITM does have a data warehouse included as standard. Yeah. That’s true.
Simon: Coverage of almost every application, operating system and storage device?
Bob: Oh yes… full monitoring coverage, Miles, you remember when we used to have to write custom monitors for almost everything. Now all monitoring can be covered by a single tool.
Miles: All right, I’ll grant you that the data warehouse and full monitoring coverage are two things that ITM has done…
Simon: And Tivoli Performance Analyzer…
Miles: (sharply) Well yes, obviously Tivoli Performance Analyzer is now included as part of the standard ITM deployment… TPA goes without saying. But apart from the data warehouse, full monitoring coverage and TPA…
Other Voices: Workflows… Best Practice Events… Expert Advice…
Miles: Yes… all right, fair enough, best practice does save time on deployment, that’s true. And expert advice does help with resolution.
Simon: And Cognos Reporting as standard. What other monitoring product has a business analytics tool and the ability to run What If Analysis as standard?
Bob: Yeah. That’s something customers would really miss if we uninstalled ITM, Miles.
Miles: All right… all right… but apart from the data warehouse and TPA and Cognos reports as standard and workflows and capacity analytics and best practice events and expert advice… what has ITM ever done for us?
Simon: Situation correlation and Dynamic Thresholding.
Miles: (very angry) What!? Oh… (scornfully) Dynamic Thresholding, yes… shut up!
If you have ever had a conversation like Miles’s, then you are probably not getting the return on investment that you hoped for when you first bought ITM. For example, IBM suggests that you should be looking to achieve the following benefits:
- Improved application availability of between 5% and 17%
- Staff savings of up to 15% for managing IT resources holistically
- Reduction of symptomatic incidents by 40%
- Management of additional applications with existing resources, a saving of 2 headcount (HC) per application
- 20% reduction in costs associated with change management
- Avoidance of hardware purchases of 5% or greater
If none of these are being achieved then there is a problem. Obviously the actual savings you can achieve will depend on the size and scale of your team and deployment, but they also depend on how much of the product you are using. If you only use it for simple alerts, then I would suggest you need to start thinking about how you can get more for your investment.
Here are 5 ideas to get you started:
1/ Use ITM for performance as well as availability
When ITM was first released it had a few issues around sampling intervals for data collection, which meant that some Performance Analysts dismissed ITM as incapable of collecting the granular data that they required. This is no longer the case. ITM can collect data every minute and store the data in the warehouse every 15 minutes.
This allows companies to use the same tool for performance as they already use for availability, which provides real cost savings and cuts down on the number of agents deployed and the skills needed. The best news, though, is that the data warehouse can be used to collect any ITM metric, so you can see performance data for an application such as Microsoft SQL Server just as easily as you can collect Memory or CPU.
And the database for the Warehouse need not cost you anything either. The DB2 database is included as part of the price and the agent for monitoring the database is also now included.
2/ Use Tivoli Performance Analyzer to Predict Future Performance
Tivoli Performance Analyzer (TPA) allows predictive trending and alerting on key operational metrics to demonstrate how system performance and capacity will evolve over time. TPA used to be a purchasable option for ITM. It is now included as part of the standard deployment and so you can get capacity planning alongside your standard monitoring for no extra charge.
In addition, the product is improving. IBM recently bought SPSS®, a comprehensive system for analyzing data and statistics. It is sold as a standalone product, but IBM are also integrating the SPSS software into TPA so that it can predict non-linear trends, which makes TPA’s predictions more accurate. In the example below you can see the difference non-linear trending can make to the predictions of TPA. In the screenshot the data collected is growing exponentially, and a simple linear prediction (2nd graph) will not identify the issue as quickly as the SPSS integrated model (1st graph).
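To make the difference between linear and non-linear trending concrete, here is a minimal Python sketch. It is not the algorithm used by TPA or SPSS; it simply fits a straight line and an exponential curve to the same made-up disk usage figures (all names and numbers are hypothetical) and asks each model when a 90% threshold will be reached.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def day_threshold_reached_linear(xs, ys, threshold):
    """Day on which a straight-line trend crosses the threshold."""
    slope, intercept = fit_line(xs, ys)
    return (threshold - intercept) / slope

def day_threshold_reached_exponential(xs, ys, threshold):
    """Day on which an exponential trend crosses the threshold.

    Fits a straight line to log(y), which is equivalent to y = a * b**x.
    """
    slope, intercept = fit_line(xs, [math.log(y) for y in ys])
    return (math.log(threshold) - intercept) / slope

# Hypothetical history: disk usage starts at 10% and grows 5% per day.
days = list(range(30))
usage = [10 * 1.05 ** d for d in days]

linear_day = day_threshold_reached_linear(days, usage, 90.0)
exponential_day = day_threshold_reached_exponential(days, usage, 90.0)

print(f"Linear trend predicts 90% usage on day {linear_day:.0f}")
print(f"Exponential trend predicts 90% usage on day {exponential_day:.0f}")
```

With data that really is growing exponentially, the straight-line model predicts the threshold breach later than the exponential model does, which is exactly the kind of late warning the non-linear trending is meant to avoid.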
3/ Ensure that you have full monitoring coverage of your services
The aim of any monitoring solution should be to monitor the service from end to end. Clearly, if part of the service is not monitored, the service could fail without an alert being received. With many simple monitoring products, components such as the Hypervisor, SAN or Network can be missed because the product does not cover them. With ITM there are no such difficulties, and it should therefore be your aim to monitor all the components that constitute the service. This is needed because even relatively simple services cross many IT silo boundaries:
- Servers & Clients & Virtualization
- Middleware (Web, Database, Messaging)
Therefore, if you have a service running on VMware, Citrix or Microsoft Hypervisors, containing databases such as Microsoft SQL Server, Oracle, Sybase or DB2 running on Linux, Windows, OS/400 or z/Linux, and connecting to a SAN such as NetApp or Brocade, ITM can and should be implemented to cover all of the service.
4/ Set up Reporting and supply regular reports to the rest of the business
If you run a Tivoli Monitoring department, then keeping all the systems and applications running at 100% can sometimes not be enough. To make the rest of the business appreciate what you are doing for them, you need to offer them something tangible, and the best way to do this is to provide regular and useful reports. Luckily, Tivoli comes with an integrated Data Warehouse that facilitates both real-time and historical reporting. ITM also comes with Tivoli® Common Reporting, which is now powered by the Business Analytics tool Cognos. This provides an integrated reporting solution and allows multiple reports to be linked across various Tivoli products, which simplifies report navigation and speeds up access to key reporting information. Because of the Cognos integration, reports can now also predict future performance and support What If analysis. For example, you can now report on whether a VMware ESX server or cluster will cope with an additional load.
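The idea behind a What If report can be illustrated with a small Python sketch. This is not how Cognos or Tivoli Common Reporting implements it; the `Cluster` class, the 20% headroom figure and all of the capacity numbers are assumptions made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    """Hypothetical capacity figures, as might be read from warehouse data."""
    cpu_capacity_ghz: float
    cpu_used_ghz: float
    mem_capacity_gb: float
    mem_used_gb: float

def can_absorb(cluster, extra_cpu_ghz, extra_mem_gb, headroom=0.2):
    """Return True if the cluster stays below (1 - headroom) utilisation
    for both CPU and memory after the additional load is added."""
    cpu_after = (cluster.cpu_used_ghz + extra_cpu_ghz) / cluster.cpu_capacity_ghz
    mem_after = (cluster.mem_used_gb + extra_mem_gb) / cluster.mem_capacity_gb
    limit = 1.0 - headroom
    return cpu_after <= limit and mem_after <= limit

# Assumed current state of an ESX cluster.
esx_cluster = Cluster(cpu_capacity_ghz=100.0, cpu_used_ghz=60.0,
                      mem_capacity_gb=512.0, mem_used_gb=300.0)

# What if we add a small workload, or a large one?
small_fits = can_absorb(esx_cluster, extra_cpu_ghz=10.0, extra_mem_gb=64.0)
large_fits = can_absorb(esx_cluster, extra_cpu_ghz=25.0, extra_mem_gb=128.0)

print("Small workload fits:", small_fits)
print("Large workload fits:", large_fits)
```

A real report would base the inputs on historical and predicted usage from the warehouse rather than a single snapshot, but the question being answered is the same.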
5/ Measure your Users’ Real Experience
Application and component monitoring can sometimes not be enough to measure your customers’ experience of using your applications. As the diagram below demonstrates, it is possible to have all the components running within acceptable limits, but if all of them perform slightly below par at once, the real user experience is degraded.
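A quick worked example shows how this can happen. The Python sketch below uses invented response times and limits (none of them come from ITM or ITCAM): every component is inside its own limit, so no component alert fires, yet the total time the user waits is over the end-to-end target.

```python
# Hypothetical response times in milliseconds: (measured, per-component limit).
hops = {
    "web server": (240, 250),
    "application server": (290, 300),
    "database": (190, 200),
    "messaging": (95, 100),
    "storage": (145, 150),
}

# Assumed end-to-end target for the whole transaction.
END_TO_END_TARGET_MS = 900

def breached_components(hops):
    """Names of components whose measured time exceeds their own limit."""
    return [name for name, (measured, limit) in hops.items() if measured > limit]

def end_to_end_ms(hops):
    """Total time the user waits, assuming the hops run one after another."""
    return sum(measured for measured, _ in hops.values())

breaches = breached_components(hops)
total_ms = end_to_end_ms(hops)
user_experience_degraded = total_ms > END_TO_END_TARGET_MS

print("Component alerts:", breaches)
print("End-to-end time (ms):", total_ms)
print("User experience degraded:", user_experience_degraded)
```

Component monitoring alone reports nothing wrong here, which is why measuring the transaction as the user sees it matters.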
The ITCAM agents for IBM Tivoli Monitoring can measure the real user experience both through robotic agents and through their ability to break a transaction down into its component parts and measure each one. The screenshot below shows a transaction in ITM, with the time taken by each hop in its path.
Still not convinced? Well, I’m so sure that we can help you achieve an ROI that I will offer the services of somebody from Orb Data for a day to audit your current use of ITM (or Netcool) and help you make better use of your monitoring tools, so that you get the return on investment that ITM is capable of giving. To arrange this, send me an email at email@example.com