Leveraging real-time and historic data to predict the future and optimise investment

By Simon Barnes

You would expect any availability tool to be able to inform you of current problems, but how many tools can predict the future capacity needs of the resources they monitor? This aspect of monitoring is becoming increasingly important as companies struggle to lower hardware and power costs by sharing IT resources.

As typical server usage is about 20% of capacity, there is scope for virtualisation and server sharing. However, knowing what spare capacity there actually is, and finding out how much unused disk, CPU, memory or even database table space there will be in a few months' time, is extremely difficult.

IBM Tivoli Monitoring comes complete with a Data Warehouse that consolidates and stores data for analysis and reporting in an industry-standard relational database. The warehouse can be accessed through the Tivoli Enterprise Portal (TEP) or the Tivoli Common Reporting tool. Combining this warehouse data with the new Tivoli capacity management product, Performance Analyzer, helps you identify trends, predict system behaviour and make informed management decisions to guide future growth.

The Issues

  • Have I got critical system, database or application issues I should be addressing now?
  • What will my IT resources look like tomorrow, next week and in 3 months' time?
  • What IT resources should I worry about next?
  • Will I have enough capacity to get me through Monday?

In this example the disk space has been forecast for 7, 30 and 90 days and a trend line has been plotted against the real data collected. It also shows that in 21 days the disk space will reach a critical threshold.

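[Figure: disk space forecast for 7, 30 and 90 days, with a trend line plotted against the collected data]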

Main Features

By pressing the inbuilt Performance Analyzer button on the Tivoli Enterprise Portal (TEP), you can modify your trends and derived values or add new ones.

Linear Trending

The linear trending algorithm uses the least squares regression method to calculate the trended value of a monitored attribute, together with status data such as the strength of the trend and whether it is moving upwards or downwards. Least squares is a mathematical optimisation technique which, given a series of measured data, finds the linear function that most closely approximates the data.

The trended value is the approximate value of a monitored attribute at the end of a given forecast period (by default 7, 30 or 90 days, although these can be supplemented or modified as you wish). To produce it, an x value is derived from the current time and the forecast period, and the calculated function is then used to generate the approximate value of the monitored attribute (y) at that point.
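As an illustration only, the least squares approach described above can be sketched in a few lines of Python. This is not Performance Analyzer's own code: the names `fit_trend`, `forecast` and `days_until`, the use of days as the x axis, and the sample figures are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Trend:
    slope: float      # change in the attribute per day
    intercept: float  # fitted value at day 0


def fit_trend(days, values):
    """Ordinary least squares fit of values = slope * days + intercept."""
    n = len(days)
    if n < 2 or n != len(values):
        raise ValueError("need at least two matching (day, value) samples")
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("samples must cover more than one point in time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    slope = sxy / sxx
    return Trend(slope, mean_y - slope * mean_x)


def forecast(trend, now, horizon_days):
    """Approximate value (y) at x = now + forecast period."""
    return trend.slope * (now + horizon_days) + trend.intercept


def days_until(trend, now, threshold):
    """Days until the trend line reaches threshold, or None if it never will."""
    if trend.slope == 0:
        return None
    remaining = (threshold - forecast(trend, now, 0)) / trend.slope
    return remaining if remaining >= 0 else None


# Made-up samples: disk space used (GB) on five consecutive days.
days = [0, 1, 2, 3, 4]
used_gb = [60.0, 61.0, 62.0, 63.0, 64.0]
trend = fit_trend(days, used_gb)

for horizon in (7, 30, 90):  # the default forecast periods
    print(horizon, "days:", forecast(trend, now=4, horizon_days=horizon))
print("days to an 85 GB threshold:", days_until(trend, now=4, threshold=85.0))
```

The same fitted line serves both purposes: evaluating it at a future x gives the forecast value, and solving it for a threshold gives the time remaining until that threshold is reached.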

Arithmetic

Derived arithmetic values are also available. These calculations use expressions to produce a single, more meaningful, new attribute. The expression may include any arithmetic function. An example is the workload parameter used in Performance Analyzer, which is derived from the following function:

Workload = (( CPUUtilization * 6) / 10) + (( CommittedMemory * 40) / CommittedLimit )

The calculated arithmetic result is stored in a single attribute and shown in the workspace. It is then reported along with other useful derived parameters, such as memory differential and CPU consumption (the CPU that a system is actually attempting to use), to give a concise view of a system's performance.
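To make the workload expression concrete, here is a minimal Python sketch of the same arithmetic. The function name and the sample figures are hypothetical, and it assumes CPUUtilization is a percentage and that CommittedMemory and CommittedLimit share the same unit.

```python
def workload(cpu_utilization, committed_memory, committed_limit):
    """Workload = ((CPUUtilization * 6) / 10) + ((CommittedMemory * 40) / CommittedLimit)."""
    if committed_limit <= 0:
        raise ValueError("committed_limit must be positive")
    cpu_part = (cpu_utilization * 6) / 10                    # up to 60 when CPU is at 100%
    memory_part = (committed_memory * 40) / committed_limit  # 40 when memory is fully committed
    return cpu_part + memory_part


# Made-up figures: 50% CPU, 2048 MB committed out of a 4096 MB limit.
print(workload(50, 2048, 4096))
```

Under those assumptions the CPU term contributes up to 60 points and the memory term reaches 40 when memory is fully committed, so a system at 100% CPU with all of its memory committed scores 100.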

Workspaces

Tivoli Performance Analyzer is distributed with a number of predefined workspaces. These workspaces fall into two categories: Utilisation and System Health. The Utilisation workspaces provide reports on:

  • CPU use
  • Disk use
  • Inbound and Outbound Network Traffic
  • Available Memory
  • Overview of all systems

From the Utilisation reports you can link to forecast reports for a particular managed system.

The System Health workspace provides reports on:

  • CPU consumption
  • Memory consumption
  • Workload

Sample Workspaces

[Screenshot: sample workspaces]

Performance Analyzer also has the concept of domains, which allow support to be added for other types of agent for which data has been collected. Domains have recently been added for DB2, Oracle, System pSeries and ITCAM for RT.

Predictive Alerts

A situation is a logical expression involving one or more system conditions. Situations are used to monitor the condition of systems in your network. When a condition matches the specified situation, an alert icon is displayed in the Navigator of Tivoli Enterprise Portal and a further, specified, action can be undertaken.

Using Performance Analyzer you can create alerts that warn of an impending problem within a defined period of time. For example, the alert Disk_TimeToCriticalThreshold_1W fires if a monitored disk is predicted to reach the defined critical limit within the next seven days and the prediction is strong enough (Strength = 3) to be valuable. Strength is a derived discrete value that enables operators to quickly evaluate the strength of the trend predicted by the algorithm. It is based on the following attributes:

  • 1 (Weak) – confidence < 50% and number of samples < 10
  • 2 (Moderate) – confidence >= 50% and number of samples >= 10
  • 3 (Strong) – confidence >= 65% and number of samples >= 25
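The strength levels above, and the way a predictive alert might combine them with a time-to-threshold value, can be sketched as follows. This is a hedged illustration rather than the product's logic: the function names are hypothetical, and treating every case that does not qualify as Moderate or Strong as Weak is an assumption, since the list above does not say how mixed cases (for example, high confidence from very few samples) are classified.

```python
WEAK, MODERATE, STRONG = 1, 2, 3


def trend_strength(confidence_pct, samples):
    """Map confidence (%) and sample count to a discrete strength value."""
    if confidence_pct >= 65 and samples >= 25:
        return STRONG
    if confidence_pct >= 50 and samples >= 10:
        return MODERATE
    # Assumption: anything that falls short of Moderate is reported as Weak.
    return WEAK


def should_alert(days_to_critical, strength, window_days=7, min_strength=STRONG):
    """True if the threshold is predicted within the window by a strong enough trend."""
    if days_to_critical is None:  # the trend never reaches the threshold
        return False
    return 0 <= days_to_critical <= window_days and strength >= min_strength


# Made-up example in the spirit of Disk_TimeToCriticalThreshold_1W.
strength = trend_strength(confidence_pct=72, samples=40)
print(strength, should_alert(days_to_critical=5, strength=strength))
```

Requiring both conditions keeps a steep trend built on only a handful of samples from raising an alert.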

Coming Soon…

As with any new product, the first few releases don’t have all the features of more mature products. However, the next release of Performance Analyzer will add some of the functionality that was missing in version 6.1.1. Amongst the promised new features are an expanded Virtualized Environments domain that supports VMware and the addition of visual baselines, so that you can compare current values against a defined baseline and also use these baselines to create dynamic thresholds. This will mean that alerts are raised only when something becomes unusual.

Next Steps

If you are interested in looking at Performance Analyzer for yourself, we would be most happy to arrange a demonstration for you. If you would just like to discuss this or anything else, call us on +44 (0)1628 550450 or email me at simon.barnes@orb-data.com for more information.
