Leveraging real-time and historic data to predict the future and optimise investment
By Simon Barnes
You would expect any availability tool to be able to inform you of current problems, but how many tools can predict the future capacity needs of the resources they monitor? This aspect of monitoring is becoming increasingly important as companies struggle to lower hardware and power costs by sharing IT resources.
As typical server usage is about 20% of capacity, there is scope for virtualisation and server sharing. However, knowing what spare capacity there actually is, and finding out how much unused disk, CPU, memory or even database table space there will be in a few months' time, is extremely difficult.
IBM Tivoli Monitoring comes complete with a Data Warehouse that consolidates and stores data for analysis and reporting in an industry-standard relational database. The data can be accessed through the Tivoli Enterprise Portal (TEP) or the Tivoli Common Reporting tool. Combined with the new Tivoli capacity management product, Performance Analyzer, the warehouse data helps you identify trends, predict system behaviour and make informed management decisions to guide future growth.
In this example the disk space has been forecast for 7, 30 and 90 days and a trend line has been plotted against the real data collected. It also shows that in 21 days the disk space will reach a critical threshold.
By pressing the built-in Performance Analyzer button on the Tivoli Enterprise Portal (TEP), you can modify the existing trends and derived values or add your own.
The Linear trending algorithm uses the Least Squares Regression method to calculate the trended value of a monitored attribute, together with status data such as the strength of the trend and whether it is moving upwards or downwards. Least squares regression is a mathematical optimisation technique which, given a series of measured data, finds the linear function that most closely approximates the data.
The trended value is the approximate value of a monitored attribute at the end of a given forecast period (by default 7, 30 or 90 days, but these can be supplemented or modified as you wish). The algorithm derives an x value from the current time and the forecast period, then uses the calculated function to generate the approximated value of the monitored attribute (y).
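To make the idea concrete, here is a minimal sketch of least-squares trending in Python. It is not Performance Analyzer's own code: the function names fit_line and forecast, the choice of days as the time unit and the sample figures are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) samples of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x/y cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(slope: float, intercept: float, now: float, days_ahead: float) -> float:
    """Evaluate the fitted line at x = now + days_ahead."""
    return slope * (now + days_ahead) + intercept


# Hypothetical data: disk use (%) sampled once a day for five days.
days = [0, 1, 2, 3, 4]
disk_used = [50.0, 50.5, 51.0, 51.5, 52.0]

slope, intercept = fit_line(days, disk_used)
direction = "upwards" if slope > 0 else "downwards" if slope < 0 else "flat"

# Trended values for the default forecast periods of 7, 30 and 90 days.
forecasts = {d: forecast(slope, intercept, now=days[-1], days_ahead=d)
             for d in (7, 30, 90)}
print(direction, forecasts)
```

With this sample data the fitted line rises by 0.5 percentage points a day, so the three forecasts come out at 55.5%, 67% and 97%. A straight line is not capped at 100%, so a real report would need to treat values beyond the physical limit as "full".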
Also available are derived arithmetic values. The calculations use expressions and result in a single, more meaningful, new attribute. The expression may include any arithmetic function. An example of an arithmetic expression is the workload parameter used in Performance Analyzer. This is derived from the following function:
Workload = (( CPUUtilization * 6) / 10) + (( CommittedMemory * 40) / CommittedLimit )
The calculated arithmetic result is stored in a single attribute, shown in the workspace and then reported along with other useful derived parameters, such as memory differential and CPU consumption (the CPU that a system is actually attempting to use), to give a concise view of a system's performance.
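The workload expression above can be written out as a small Python function. This is only a sketch: the parameter names mirror the attributes in the formula, and it assumes that CPU utilisation is a percentage and that the two memory figures share the same unit.

```python
def workload(cpu_utilization: float,
             committed_memory: float,
             committed_limit: float) -> float:
    """Combine CPU and memory pressure into one derived value.

    Assumes cpu_utilization is a percentage (0-100) and that
    committed_memory and committed_limit use the same unit.
    """
    if committed_limit <= 0:
        raise ValueError("committed_limit must be positive")
    cpu_part = (cpu_utilization * 6) / 10
    memory_part = (committed_memory * 40) / committed_limit
    return cpu_part + memory_part


# Hypothetical system: 50% CPU, 2048 MB committed out of a 4096 MB limit.
example = workload(cpu_utilization=50.0,
                   committed_memory=2048.0,
                   committed_limit=4096.0)
print(example)  # 50.0
```

Under those assumptions the CPU term contributes at most 60 and the memory term reaches 40 when committed memory equals the limit, so a system at 100% CPU with memory at its limit scores 100.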
Tivoli Performance Analyzer is distributed with a number of predefined workspaces. These workspaces fall into two categories: Utilisation and System Health. The Utilisation workspaces provide reports on:
- CPU use
- Disk use
- Inbound and Outbound Network Traffic
- Available Memory
- Overview of all systems
From the Utilisation reports you can link to forecast reports for a particular managed system.
The System Health workspaces provide reports on:
- CPU consumption
- Memory consumption
However, Performance Analyzer also has the concept of domains, which allow support to be added for other types of agent for which data has been collected. Domains have recently been added for DB2, Oracle, System pSeries and ITCAM for RT.
A situation is a logical expression involving one or more system conditions. Situations are used to monitor the condition of systems in your network. When a condition matches the specified situation, an alert icon is displayed in the Navigator of Tivoli Enterprise Portal and a further, specified, action can be undertaken.
Using Performance Analyzer you can create alerts that warn of a problem expected within a defined period of time. For example, the alert Disk_TimeToCriticalThreshold_1W is raised when a monitored disk is predicted to reach the defined critical limit within the next seven days and the prediction is strong enough (Strength = 3) to be relied upon. Strength is a derived discrete value that lets operators quickly evaluate the strength of the trend predicted by the algorithm, based on the confidence of the trend and the number of samples:
- 1 (Weak) – confidence < 50% and number of samples < 10
- 2 (Moderate) – confidence >= 50% and number of samples >= 10
- 3 (Strong) – confidence >= 65% and number of samples >= 25
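The strength bands and the seven-day alert can be sketched in Python as follows. This is an illustration only, not how the product evaluates situations. The function names are invented, confidence is taken as a ready-made percentage (how it is calculated is not covered here), and because the bands above do not cover every combination (for example 70% confidence with only five samples), the sketch assumes that anything failing the Moderate test is treated as Weak.

```python
from typing import Optional


def trend_strength(confidence_pct: float, samples: int) -> int:
    """Map confidence and sample count to 1 (Weak), 2 (Moderate) or 3 (Strong)."""
    if confidence_pct >= 65 and samples >= 25:
        return 3
    if confidence_pct >= 50 and samples >= 10:
        return 2
    return 1  # assumption: every remaining combination counts as Weak


def days_to_threshold(slope: float, current_value: float,
                      threshold: float) -> Optional[float]:
    """Days until a linear trend reaches the threshold, or None if it never will."""
    if current_value >= threshold:
        return 0.0
    if slope <= 0:
        return None
    return (threshold - current_value) / slope


def should_alert(slope: float, current_value: float, threshold: float,
                 confidence_pct: float, samples: int,
                 window_days: float = 7) -> bool:
    """True when the threshold is forecast within the window and Strength = 3."""
    remaining = days_to_threshold(slope, current_value, threshold)
    return (remaining is not None
            and remaining <= window_days
            and trend_strength(confidence_pct, samples) == 3)


# Hypothetical disk at 88% use, growing 0.5 points a day, critical at 90%.
print(days_to_threshold(0.5, 88.0, 90.0))                            # 4.0
print(should_alert(0.5, 88.0, 90.0, confidence_pct=70, samples=30))  # True
print(should_alert(0.5, 88.0, 90.0, confidence_pct=55, samples=12))  # False
```

The last call shows why the strength check matters: the disk is still forecast to hit the limit in four days, but a Moderate trend is not enough to raise the alert.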
As with any new product, the first few releases do not have all the features of more mature products. However, the next release of Performance Analyzer will add some of the functionality that was missing in version 6.1.1. The promised new features include an expanded Virtualized Environments domain that supports VMware, and visual baselines that let you compare current values against a defined baseline and use those baselines to create dynamic thresholds. This will mean that alerts are raised only when something becomes unusual.
If you are interested in looking at Performance Analyzer for yourself, we would be most happy to arrange a demonstration for you. If you would just like to discuss this or anything else, call us on +44 (0)1628 550450 or email me at firstname.lastname@example.org for more information.