|
| MAXTHRESHOLD | MINTHRESHOLD |
|
|
When a threshold is crossed for the first time, the boom Agent creates an Indication and sends it to
the boom Server. After that, the Agent stays silent until a subsequent value crosses a different threshold level.
This suppression algorithm reduces the number of messages received by the operator as well as network traffic.
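The suppression behaviour described above can be sketched as follows. This is only an illustration, not the boom Agent's implementation: the function names, the (threshold, severity) tuple layout and the use of `>=` to decide that a threshold is crossed are assumptions, and reset values are left out for brevity.

```python
from typing import List, Optional, Tuple

Condition = Tuple[float, str]  # (threshold, severity), hypothetical layout


def level_for(conditions: List[Condition], value: float) -> Optional[str]:
    """Return the severity of the highest MAXTHRESHOLD condition reached."""
    reached = [sev for thr, sev in sorted(conditions) if value >= thr]
    return reached[-1] if reached else None


def indications(conditions: List[Condition],
                values: List[float]) -> List[Optional[str]]:
    """For each submitted value, return the severity sent, or None if silent."""
    sent: List[Optional[str]] = []
    current: Optional[str] = None  # level reached by the previous value
    for value in values:
        level = level_for(conditions, value)
        # Only a change of level produces a message; repeats stay silent.
        # Dropping below every threshold yields None in this sketch.
        sent.append(level if level != current else None)
        current = level
    return sent
```

With a warning condition at 80 and a critical condition at 95, the values 85 and 88 would produce a single warning message in this model, because the second value stays on the same level.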
Multiple conditions can share the same severity, and severities that are not needed can be skipped.
Object filters make it possible to combine multiple condition sets for different objects in one Policy.
When the calculation engine processes a monitor value, it considers only the conditions that match the object value
submitted together with that monitor value. If no Object mask is specified, the condition applies to all objects.
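The object filtering step can be pictured with the sketch below. The mask syntax used by boom is not described here, so shell-style wildcards via Python's `fnmatch` module are an assumption, as are the dictionary keys `name` and `object_mask`.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional

Condition = Dict[str, Optional[str]]  # hypothetical condition record


def applicable_conditions(conditions: List[Condition], obj: str) -> List[Condition]:
    """Keep conditions whose Object mask matches obj, or that have no mask."""
    selected = []
    for cond in conditions:
        mask = cond.get("object_mask")
        # A missing mask means the condition applies to every object.
        if mask is None or fnmatchcase(obj, mask):
            selected.append(cond)
    return selected
```

For example, a monitor value submitted for the object `/data1` would be evaluated only against conditions whose mask matches `/data1` and against conditions without a mask.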
All Monitor Policy thresholds come with Reset values. The "Reset" concept plays an important role
in the calculation.
First of all, the Monitor Policy has a global flag called "Policy with reset".
If this flag is set to 'NO', the Policy ignores all reset values as well as the silence periods and delivers every value
submitted to the Agent. Monitors of this type are also known as 'Continuous Monitors'.
If 'YES' is selected, the Policy is a threshold Monitor with reset.
When a monitor value crosses a threshold in the defined direction (increasing for MAXTHRESHOLD and decreasing
for MINTHRESHOLD), this is called an "elevation". A crossing in the opposite direction is a "reset".
Monitors with a reset value different from the threshold value
are handled specially in the "reset" direction. The reset value makes it possible to ignore small fluctuations of the value and to keep
the reached threshold level unchanged.
The reset feature can be explained with the following example:
A process CPU utilization monitor has a critical threshold of 95%, indicating high CPU load of the process.
A normal threshold, indicating a normal state, is set above 0%. When the process reaches 95%, an Indication
is sent to the server. Let's assume
the critical condition specifies a reset value of 70%. This keeps the
critical level unchanged until the process drops below 70% CPU, so fluctuations between 70 and 95
percent do not reset the severity to normal.
Another example is a MINTHRESHOLD monitor of free disk space:
A critical condition has a threshold of 100 MB
and a reset value of 1024 MB. As a result, a critical Indication such as "100 MB free space left on a disk..."
stays active until an administrator frees up at least 1 GB of disk space.
A Minimum type threshold requires the reset value to be greater than the threshold; a Maximum type threshold requires the reset value to be
less than the threshold.
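The two examples above (CPU with threshold 95 and reset 70, free disk space with threshold 100 MB and reset 1024 MB) can be replayed with the following sketch. It is a simplified model of one condition, not the Agent's actual code: the function names are hypothetical, and the exact handling of values equal to the threshold or reset value is an assumption derived from the wording of the examples.

```python
from typing import List


def condition_active(kind: str, threshold: float, reset: float,
                     value: float, was_active: bool) -> bool:
    """Return whether the condition is active after `value` is submitted."""
    if kind == "MAXTHRESHOLD":
        if reset > threshold:
            raise ValueError("MAXTHRESHOLD reset must not exceed the threshold")
        if value >= threshold:          # elevation
            return True
        # Stay active until the value drops below the reset value.
        return was_active and value >= reset
    if kind == "MINTHRESHOLD":
        if reset < threshold:
            raise ValueError("MINTHRESHOLD reset must not be below the threshold")
        if value <= threshold:          # elevation
            return True
        # Stay active until the value climbs to at least the reset value.
        return was_active and value < reset
    raise ValueError(f"unknown monitor type: {kind}")


def replay(kind: str, threshold: float, reset: float,
           values: List[float]) -> List[bool]:
    """Feed values one by one and record the condition state after each."""
    states, active = [], False
    for value in values:
        active = condition_active(kind, threshold, reset, value, active)
        states.append(active)
    return states
```

In this model, `replay("MAXTHRESHOLD", 95, 70, [50, 96, 80, 72, 69, 80])` keeps the condition active for 96, 80 and 72 and releases it at 69; the final 80 does not reactivate it because 95 has not been reached again.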
Another option in the condition section of a Monitor Policy is the "Ignore Reset" flag, which is set to "NO" by default. Setting the flag to "YES" makes the threshold condition 'continuous': at this level, every submitted Monitor value generates an Indication. This can be used for more precise monitoring of critical conditions.
The "Silence Count" parameter of a condition can be used to suppress the first few matching values. The boom Agent ignores the specified number of submitted values that match the condition before it sends an Indication to the boom Server.
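A minimal sketch of the Silence Count idea is shown below. The class name and method are hypothetical, and one detail is an assumption not stated above: the counter is restarted whenever a value no longer matches the condition.

```python
class SilenceCounter:
    """Ignore the first `silence_count` matching values of a condition."""

    def __init__(self, silence_count: int) -> None:
        self.silence_count = silence_count
        self.seen = 0  # matching values seen in the current run

    def should_send(self, matches: bool) -> bool:
        if not matches:
            # Assumption: a non-matching value restarts the silence period.
            self.seen = 0
            return False
        self.seen += 1
        # Send exactly once, on the first match after the ignored ones.
        return self.seen == self.silence_count + 1
```

With a Silence Count of 2, the first two matching values are ignored in this model and the third one triggers the Indication.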
If an Indication is delivered with a Close Mask, the server can automatically close related previous Indications.
The default working directory for the monitor's executable is "$BOOM_ROOT/spi/". All binaries and script calls must be specified relative to this directory. To use a binary that is located elsewhere but is available via the PATH variable, prefix its name with the '#' character, e.g. #df, #top.
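The path convention above can be illustrated with a small helper. This is not how the Agent itself resolves commands; the function name and the `boom_root` parameter are hypothetical, and the PATH lookup uses Python's standard `shutil.which`.

```python
import os
import shutil
from typing import Optional


def resolve_monitor_command(command: str, boom_root: str) -> Optional[str]:
    """Map a configured monitor command to the file that would be executed."""
    if command.startswith("#"):
        # '#' prefix: look the binary up in PATH (None if it is not found).
        return shutil.which(command[1:])
    # Otherwise the command is taken relative to $BOOM_ROOT/spi/.
    return os.path.join(boom_root, "spi", command)
```

Under these assumptions, `resolve_monitor_command("#df", boom_root)` would return the location of `df` found in PATH, while a plain name such as `check.sh` (a made-up script name) would be looked up under the `spi` directory.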
The "Alert Finished" state is reached when the monitor submits a value outside the defined threshold borders: for MAXTHRESHOLD, a value below the lowest threshold; for MINTHRESHOLD, a value above the highest threshold. This state marks the end of the previous state and lets operators see whether a problem is still ongoing. Besides this benefit for the exception-based operations concept, it also avoids sending Indications with normal severity.
The Indication Browser displays such Indications with the Severity struck through.
After an alert is finished, the Indication details show the time stamp and the last value that triggered the finished state.
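The rule for reaching "Alert Finished" can be written down as a short check. This sketch uses a hypothetical function name, takes the list of threshold values of all conditions as input, and deliberately ignores reset values, which would delay the transition in a policy with reset.

```python
from typing import List


def is_alert_finished(kind: str, thresholds: List[float],
                      value: float, alert_active: bool) -> bool:
    """Return True if `value` ends a currently active alert."""
    if not alert_active:
        return False  # nothing to finish
    if kind == "MAXTHRESHOLD":
        # Outside the borders means below the lowest threshold.
        return value < min(thresholds)
    if kind == "MINTHRESHOLD":
        # Outside the borders means above the highest threshold.
        return value > max(thresholds)
    raise ValueError(f"unknown monitor type: {kind}")
```

For a MAXTHRESHOLD policy with thresholds 80 and 95, a value of 60 after an active alert would count as finished in this model, while 85 would not.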
Starting with v2.55, Monitor policies support two new types: MAXONLYCHANGE and MINONLYCHANGES. These types can be used to monitor values that change infrequently. For these types, a new Indication is created for every detected change. The policy conditions are used only to determine the severity and to define Indication attributes; thresholding is not applied.
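The change-only behaviour can be sketched as below. The class is hypothetical, and it assumes that the very first submitted value only establishes a baseline and does not count as a change; how the Agent treats the first value is not stated here. Severity selection from the policy conditions is omitted.

```python
_UNSET = object()  # marker for "no value submitted yet"


class ChangeDetector:
    """Report True whenever a submitted value differs from the previous one."""

    def __init__(self) -> None:
        self._last = _UNSET

    def submit(self, value) -> bool:
        # Assumption: the first value is a baseline, not a change.
        changed = self._last is not _UNSET and value != self._last
        self._last = value
        return changed
```

Feeding the values 3, 3, 4, 4, 4, 2 into this detector would signal a change at the first 4 and at the 2, which is where a change-only policy would create new Indications.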