Alert Messages and Actions
You can use the Parameters and Alerts options to manage collection errors, define thresholds and severity alert levels, choose notification delivery methods and specify the troubleshooting or recovery actions you want to implement when a problem is detected on a monitored technology.
Managing Collection Errors
All Monitors distinguish failures in the monitoring target from failures in the monitoring tool. Failures and errors in the monitoring tool itself (such as invalid credentials or configuration errors) are reported in the Template's Collection Error Count parameter. This way, errors in the monitoring tool do not affect the reported availability of the monitored system or application.
However, for a few Monitors it is impossible for Monitoring Studio X to determine whether an error encountered by a Monitor should be considered as a problem in the monitored target or a failure of the monitoring tool. For these Monitors you can specify how to report these errors and failures:
- Command Line
- HTTP Request
- SNMP Polling
- WBEM Query
For example, if you retrieve performance metrics for an application from a Prometheus exporter with an HTTP request, you probably want HTTP errors (404, 500, etc.) to be reported in the Collection Error Count parameter at the Template level rather than in the Status parameter of the HTTP Monitor. Such errors do not indicate that your application is down, only that the Prometheus exporter cannot provide the performance metrics.
On the other hand, if you are monitoring a Web site with an HTTP request, you will want to report HTTP errors (404, 500, etc.) and timeouts in the Status parameter of the HTTP Monitor, to clearly alert operators that the Web site is down.
Keep the Report Execution Errors in Template's Collection Error Count option turned ON to report errors in the Collection Error Count parameter of the Template, or turn it OFF to report errors in the Monitor’s Status parameter.
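The effect of this option can be pictured with a minimal sketch. The function name and the returned labels are hypothetical and only illustrate the routing described above; they are not part of Monitoring Studio X.

```python
def route_execution_error(report_in_collection_error_count: bool) -> str:
    """Return the parameter that receives an execution error (illustrative only)."""
    if report_in_collection_error_count:
        # Option ON: the error counts as a failure of the monitoring tool.
        return "Template: Collection Error Count"
    # Option OFF: the error counts as a failure of the monitored target.
    return "Monitor: Status"


# Prometheus exporter queried over HTTP: an HTTP 500 says nothing about the application.
print(route_execution_error(True))
# Web site availability check: an HTTP 500 means the site is down.
print(route_execution_error(False))
```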
Configuring Alerts
With the Parameters and Alerts properties, you can configure any numeric parameter of a Monitor to trigger alerts, send notifications and take specific actions when certain conditions are met.
There are three alarm ranges, Alarm #1, Alarm #2 and Out-of-range, each with a minimum and a maximum value:
- Use the Alarm #1 and Alarm #2 options to define the range of parameter values that triggers warnings and alarms.
- Use the Out-of-range border conditions to be informed when the collected values are outside the norm (less than or greater than the defined range limits).
Parameters and Alerts | Description |
---|---|
Alarm #1 and #2 | |
Severity Level | To turn off the alert or set the severity level (INFO, WARN, ALARM). |
Threshold | To set the parameter threshold boundary values and the number of times the threshold must be breached before the Monitor triggers a first-level alert (Immediately or x times in a row). |
Occurrence | To define the number of consecutive times the parameter reports a value within the alarm #1 or alarm #2 range before the alert is triggered. An alert is automatically triggered upon threshold breach if the occurrence is set to Immediately. |
Alert and Acknowledgement Message | To customize the alert and acknowledgement notification messages. By default, the Monitor sends the message defined in Studio > Studio Settings. You can use Alert Messages Macros to create a message tailored to your needs. |
Alert Action | To define the actions you want Monitoring Studio X to take when a problem is detected. You can refresh any Monitor of the same Template when a parameter triggers an alert. The Monitor you choose to execute as an Alert Action can help you identify and address issues before they become critical. You decide what to do when a problem occurs, and can get feedback about the actions taken. For example, you can set up a chain of actions, where the Monitor of the parameter in alert triggers a troubleshooting Monitor, which triggers another repairing Monitor, etc. Tip: Prefix the Monitors you are most likely to use as Alert Actions (for example: AA_) to locate them rapidly in the list of Monitors. It is recommended to set the Collect Schedule of the Alert Action Monitor to Run Manually to ensure this Alert Action is executed only when a problem is detected. |
Out-of-Range | To specify the range of values within which the parameter is considered to operate normally. The border range must be larger than the Alarm #1 and Alarm #2 ranges combined. Values are described as less than and greater than, for example <0 or >100. Values falling inside this range DO NOT trigger any warning, alarm, or recovery action. |
Severity Level | To turn off the alert or set the severity level (INFO, WARN, ALARM). |
Threshold | To set the parameter threshold boundary values and the number of times the threshold must be breached before the Monitor triggers an alert (Immediately or x times in a row). |
Occurrence | To define the number of consecutive times the parameter reports an out-of-range value before the alert is triggered. An alert is automatically triggered upon threshold breach if the occurrence is set to Immediately. |
Alert and Acknowledgement Message | To customize the alert and acknowledgement notification messages. By default, the Monitor sends the message defined in Studio > Studio Settings. You can use Alert Messages Macros to create a message tailored to your needs. |
Alert Action | To define the actions you want Monitoring Studio X to take when a problem is detected. You can refresh any Monitor of the same Template when a parameter triggers an alert. The Monitor you choose to execute as an Alert Action can help you identify and address issues before they become critical. You decide what to do when a problem occurs, and can get feedback about the actions taken. For example, you can set up a chain of actions, where the Monitor of the parameter in alert triggers a troubleshooting Monitor, which triggers another repairing Monitor, etc. Tip: Prefix the Monitors you are most likely to use as Alert Actions (for example: AA_) to locate them rapidly in the list of Monitors. It is recommended to set the Collect Schedule of the Alert Action Monitor to Run Manually to ensure this Alert Action is executed only when a problem is detected. |
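The Occurrence setting ("Immediately" or "x times in a row") can be sketched as a counter of consecutive values inside an alarm range. This is only an illustration: the class name, the inclusive bounds, and the rule that any value outside the range resets the count are assumptions, not the product's documented behavior.

```python
class OccurrenceTracker:
    """Counts consecutive in-range values for one alarm range (hypothetical helper)."""

    def __init__(self, low: float, high: float, occurrences: int = 1):
        self.low = low
        self.high = high
        self.occurrences = occurrences  # 1 stands for "Immediately"
        self.streak = 0

    def update(self, value: float) -> bool:
        """Record a collected value; return True when the alert should be active."""
        if self.low <= value <= self.high:
            self.streak += 1
        else:
            self.streak = 0  # assumed: a value outside the range resets the count
        return self.streak >= self.occurrences


# Alert only after 3 consecutive values between 90 and 100.
tracker = OccurrenceTracker(low=90, high=100, occurrences=3)
for value in [95, 96, 80, 91, 92, 93]:
    print(value, tracker.update(value))
```

In this sketch the value 80 interrupts the first streak, so the alert becomes active only on the last value.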
Template Thresholds and Customized Thresholds
The alert thresholds defined in a Template are applied to all hosts using this Template, unless the parameter thresholds have been customized:
- through the Console view
- through the Agent > Agent Thresholds menu
- in a TrueSight CMA policy applied to this PATROL Agent
This way, you define default alert thresholds in the Template and operators and administrators can customize these thresholds on a specific instance.
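As a rough mental model, a customized threshold set on an instance takes the place of the Template default for that instance. The sketch below assumes a simple per-host lookup; the function name, the data layout, and the host names are hypothetical, and the actual resolution performed by the PATROL Agent may be finer-grained.

```python
from typing import Dict, Tuple

# Range name -> (minimum, maximum); layout chosen for illustration only.
Thresholds = Dict[str, Tuple[float, float]]


def effective_thresholds(host: str,
                         template_defaults: Thresholds,
                         customized: Dict[str, Thresholds]) -> Thresholds:
    """Return the customized thresholds for a host, else the Template defaults."""
    return customized.get(host, template_defaults)


template_defaults = {"Alarm #1": (90, 100), "Alarm #2": (75, 100)}
customized = {"db-server-01": {"Alarm #1": (95, 100), "Alarm #2": (85, 100)}}

print(effective_thresholds("db-server-01", template_defaults, customized))
print(effective_thresholds("web-server-02", template_defaults, customized))
```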
Overlapping Alarm Ranges and Precedence
Alert ranges can overlap, in which case the value of a parameter may fall within more than one range at the same time. The parameter then takes the state of the first matching range, evaluated in this order:
- Out-of-Range (Border)
- Alarm #1
- Alarm #2
It is therefore important to make sure the most critical range (“ALARM”) takes precedence over a less critical range (“WARN”).
Example: If you want a “WARN” event to be triggered when Processor Utilization is greater than 75% and an “ALARM” event when it is greater than 90%, set the alert thresholds as follows:
- Alarm #1: ALARM if Processor Utilization > 90
- Alarm #2: WARN if Processor Utilization > 75
As an alternative, you can set the WARN range between 75 and 90, in which case there will be no overlap. But with this configuration, a new WARN event is triggered when the parameter value goes from 93% (ALARM) to 81% (WARN). Both options are valid and have their pros and cons.
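The precedence order can be sketched as a first-match evaluation. The function below is an illustration under stated assumptions: the tuple layout, the exclusive lower bound, the upper bound of 100, and the border severity are all hypothetical choices made to reproduce the Processor Utilization example.

```python
from typing import Optional, Tuple

# (severity, exclusive minimum, inclusive maximum); layout assumed for illustration.
Range = Tuple[str, float, float]


def parameter_state(value: float,
                    border: Optional[Range],
                    alarm1: Optional[Range],
                    alarm2: Optional[Range]) -> str:
    """Return the state of the first matching range: border, Alarm #1, Alarm #2."""
    if border is not None:
        severity, low, high = border
        if value < low or value > high:  # outside the normal operating range
            return severity
    for alarm in (alarm1, alarm2):
        if alarm is not None:
            severity, low, high = alarm
            if low < value <= high:
                return severity
    return "OK"


border = ("ALARM", 0, 100)
# Recommended order: the most critical range comes first.
print(parameter_state(93, border, ("ALARM", 90, 100), ("WARN", 75, 100)))  # ALARM
print(parameter_state(81, border, ("ALARM", 90, 100), ("WARN", 75, 100)))  # WARN
# Reversed order: the WARN range matches first and hides the ALARM range.
print(parameter_state(93, border, ("WARN", 75, 100), ("ALARM", 90, 100)))  # WARN
```

The last call shows why the most critical range should be assigned to Alarm #1 when the ranges overlap.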
Alert and Acknowledgement Messages
You can define the way Monitoring Studio X notifies you when alert conditions are detected on a monitored parameter or when its status is back to normal. Alert and acknowledgement notification delivery options and default messages can be configured from Studio > Studio Settings and apply to all the Monitors running on the Agent.
Alert Delivery Options | Description |
---|---|
Annotation | To display a message at the annotation point of the parameter graph. |
PATROL Event | To customize PATROL Event types and related messages. |
Command Line | To execute a command line on the system where the PATROL Agent is installed. |
Email | To send an email to one or multiple recipients. Email addresses must be separated by commas (,) or semicolons (;). |
OS Command | To execute an OS command on the Agent. |
PSL Script | To specify the PSL statement to be executed locally by the Agent. |
Log File Entry | To add a user-defined entry to the Log file. |
SNMP Trap | To send an SNMP Trap. |
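For the email delivery option, a recipient field that accepts both separators could be parsed as sketched below. The function name and the sample addresses are hypothetical, and the sketch does not reflect how Monitoring Studio X parses the field internally.

```python
import re
from typing import List


def split_recipients(field: str) -> List[str]:
    """Split a recipient field on commas or semicolons and drop empty entries."""
    return [address.strip() for address in re.split(r"[;,]", field) if address.strip()]


print(split_recipients("ops@example.com; dba@example.com,oncall@example.com"))
```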
A Default Message Content can be configured per class of parameters and is used when no specific alert message is defined at the Monitor level. Default messages are provided for a selection of classes (Host, Numeric Value Extraction, String Search, etc.). They contain the basic information needed to report on the origin of the detected problem. These messages are used for all the parameters of an application class when you choose the Default Message property in Studio > Template/Monitor > Parameters and Alerts.
You can customize the default message content with Alert Messages Macros to keep your message dynamic and contextual. You can also delete all the provided class-specific default messages. In that case, the customizable message defined in the Other Classes option will automatically be used for all classes of parameters that are set to send a message upon a threshold violation.
Finally, you can create and customize your own default class messages to fully meet your notification policies or requirements.
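The fallback from class-specific default messages to the Other Classes message can be pictured as a lookup with a default. The function name and the message texts below are placeholders invented for illustration; they are not the messages shipped with the product and do not use real Alert Messages Macros.

```python
from typing import Dict


def default_message(parameter_class: str,
                    class_messages: Dict[str, str],
                    other_classes_message: str) -> str:
    """Return the class-specific default message, else the Other Classes message."""
    return class_messages.get(parameter_class, other_classes_message)


class_messages = {
    "Host": "Placeholder message for Host parameters",
    "String Search": "Placeholder message for String Search parameters",
}
other = "Placeholder message for all other classes"

print(default_message("Host", class_messages, other))
print(default_message("Numeric Value Extraction", class_messages, other))
# With every class-specific message deleted, all classes use the Other Classes message.
print(default_message("Host", {}, other))
```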