This Technote discusses two important questions that need to be addressed before starting an IBM Tivoli Monitoring V6.1 (or IBM Tivoli Monitoring Express) implementation. Who will own the tool and what are the monitoring needs?
Note: The following discussion is also valid for the IBM Tivoli Monitoring V6.1 Express product.
Who will own the tool?
An organization performing enterprise systems management can be thought of as being somewhere along an evolutionary process for distributed computing. While the host arena is mature, the distributed environments are generally less mature. This is not to say that a less mature environment is qualitatively less desirable than a more mature environment; it is merely the recognition of a current state. The location on the maturity scale is recognition of process maturity, technical achievement, and financial investment.
At a lower level of maturity, you can expect the Tivoli support team to own the monitoring solution and the development of monitors for the enterprise.
An increasingly common movement among more mature enterprise customers is to turn their developed products into services. In such situations, the Tivoli support team manages the infrastructure, sets the direction, provides training in the use of the services, and supports those who use them. In this context, for IBM Tivoli Monitoring V6.1, the creation of monitors is left to the administrators and the application team, while the overall tool function and architecture are held by the Tivoli support team.
The following criteria determine whether the creation of monitors can be delegated to the administrators and the application team in this way:
- IBM Tivoli Monitoring V6.1 is already installed in the environment and is stable. This includes the implementation processes.
- The administrators and the application team have the skills required for programming (that is, the general skills).
- You are prepared to educate the administrators and the application team about how to build the monitors, and provide documentation to support this.
- Precedent exists for the administrators and the application teams to “own” their monitors, and they are ready to do so (that is, they recognize the resource requirements within their teams).
- The change control to production environment is a fairly mature process, and the application team and administrators follow the process.
- Development and Quality Assurance (QA) environments are available for the application teams and administrators to develop their monitors.
- A commitment to a QA process will be made prior to the introduction of the monitors into the production environment.
If your organization is not in a position to meet these criteria, it will be better served by developing and managing the tool in a single group.
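The decision rule above can be written down as a simple checklist. The following Python fragment is only a sketch: the class name, field names, and the two result strings are hypothetical labels chosen for this illustration, and they are not part of IBM Tivoli Monitoring.

```python
from dataclasses import dataclass, fields


@dataclass
class ReadinessChecklist:
    """One flag per criterion listed above (field names are hypothetical)."""
    itm_installed_and_stable: bool
    teams_have_programming_skills: bool
    training_and_docs_available: bool
    teams_ready_to_own_monitors: bool
    mature_change_control: bool
    dev_and_qa_environments: bool
    qa_commitment_before_production: bool


def unmet_criteria(checklist):
    """Return the names of the criteria that are not yet satisfied."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


def recommend_ownership(checklist):
    """Delegate monitor creation only when every criterion is met."""
    return "delegated" if not unmet_criteria(checklist) else "single group"
```

Listing the unmet criteria, rather than returning only a yes or no answer, shows an organization what it has to address before monitor creation can be handed to the administrators and application teams.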
What are the monitoring needs?
Identifying the monitoring needs is a crucial step during an IBM Tivoli Monitoring implementation. This includes recognizing the need for application monitoring for new applications; improving the existing monitoring solutions through the introduction of newer, more sophisticated monitoring; or correlation among existing and known monitoring items. This should include identification of the criticality of the request and the impact of the situation this monitoring is to address, in order to prioritize development activities across the available resources.
One fairly straightforward way to discover the need for monitoring is through the Information Technology Infrastructure Library (ITIL) concept of the Problem Management function within your organization.
When a situation occurs that is significant enough to your organization and that could be detected by technical means, the Problem Management function should request, through the application or system owner, that a mechanism be created to do one of the following: proactively identify and resolve the issue, identify the issue and alert the required parties, or take automated action to attempt to resolve the issue before it becomes an incident.
We advocate the complete ITIL Service-Level Management process set, including Help Desk, Incident Management, Change Management, and Release Management. These processes should be observed in the monitoring space and recognized as part of this activity.
The second way to discover the need for monitoring is through the requirement for application management. All newly developed and purchased applications in the environment should be monitored as part of systems management to the level deemed necessary according to the importance of the system or application.
In the case of off-the-shelf software, the vendor should be able to provide error conditions for trapping, along with advice about which conditions are most critical. For developed software, the developer should build the software with monitoring in mind. This means that issuing alerts about possible problems prior to failure should be a function of the application.
Monitoring that is merely reactive (that is, it identifies problems only after a failure) cannot help the administrators of the application prevent that failure. Process-down monitoring is important, but a set of alerts that recognize the conditions that may precipitate or precede a failure is far more valuable.
Expanding on this concept, would you rather somebody warned you about an approaching tornado or would you want to wait for it to hit your house to react? As sure as you are about your answer to this question, it is very common for administrators to say “Do not bother me until it actually goes down.” While this does assure them that there is something they can do, it comes at the expense of the service, which, at that point, is compromised.
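The difference between the two styles of monitoring can be illustrated with a small sketch. The following Python fragment is generic and does not use any IBM Tivoli Monitoring interface; the function names, the choice of disk usage as the leading indicator, and the 80% and 95% thresholds are all assumptions made for this example.

```python
def reactive_check(process_running):
    """Process-down monitoring: reports only after the failure has happened."""
    return "ok" if process_running else "down"


def proactive_check(disk_used_pct, warning_pct=80.0, critical_pct=95.0):
    """Threshold monitoring on a condition that may precede a failure.

    The thresholds are hypothetical defaults; real values depend on the
    system or application being monitored.
    """
    if disk_used_pct >= critical_pct:
        return "critical"
    if disk_used_pct >= warning_pct:
        return "warning"
    return "ok"
```

With a file system that is 85% full and a process that is still running, the reactive check reports no problem, while the proactive check raises a warning while there is still time to act.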
In general, go through a planning session before designing your monitoring and dashboard views. We recommend the following approach:
- Define the objectives of the solution.
- Ensure that business objectives are stated.
- Define the scope of the solution.
- Identify the most critical applications.
Examine the existing technology infrastructure to see which needs can be met with the existing solution. If there are some aspects that will not be met by the existing solution, describe the additional monitoring pieces that have to be created. Identify the information and functions you need to design for the user communities.
Additionally, a gatekeeping function should exist in the team that develops or implements the monitoring solution. This role can be performed by the manager of the team or a business analyst within the team.
The “gatekeeper” should use the criteria described at the beginning of this section to make sure that the resources required for building, maintaining, and operating the monitoring solution are appropriately allocated to the current request, as well as to the other requests that have been made or will be made.
The criticality of the request helps drive the schedule for designing and implementing the monitoring. The schedule is also determined by the impact of the outage or failure that this monitoring addresses and by the likelihood of that failure. Mission-critical systems and applications should be monitored more extensively than, and differently from, systems whose impact is low. If an event occurs every day and is related to a mission-critical system, that event should be given a higher priority for monitoring development than an event that has never occurred, even though the latter may be related to a significant event.
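One way a gatekeeper might rank competing requests is to combine impact and likelihood into a single score. The following Python fragment is a minimal sketch under that assumption: the 1 to 5 scales, the multiplication of the two values, and all names are hypothetical and are not prescribed by this Technote or by ITIL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoringRequest:
    name: str
    impact: int      # 1 (low) to 5 (mission-critical); hypothetical scale
    likelihood: int  # 1 (never observed) to 5 (occurs daily); hypothetical scale


def priority_score(request):
    """Higher scores should be scheduled for development first."""
    return request.impact * request.likelihood


def prioritize(requests):
    """Return the requests ordered from highest to lowest priority."""
    return sorted(requests, key=priority_score, reverse=True)
```

Under this scoring, a daily event on a mission-critical system (impact 5, likelihood 5) ranks ahead of a significant event that has never occurred (impact 5, likelihood 1), which matches the guidance above.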
This material has not been submitted to any formal IBM test and is published AS IS. It has not been the subject of rigorous review. IBM assumes no responsibility for its accuracy or completeness. The use of this information or the implementation of any of these techniques is a client responsibility and depends upon the client's ability to evaluate and integrate them into the client's operational environment.